Follow-up from "Add brc4env-epcoll test file."

The following discussion from !138 should be addressed:

  • @raphael.flores started a discussion: (+1 comment)

    Well not at all, I'm sorry.

    There are a few (4) duplicated identifiers in the Florilège file rare_pilier-plante_florilege.utf8.json.gz:

    rflores@urgi131:~/local/git/data-discovery/data/rare (data/add-brc4env-epcoll-data *$%=) $ diff /tmp/id.sort /tmp/id.sort.uniq 
    16498d16497
    < "CRB Prunier - Madame Bonnard"
    16503d16501
    < "CRB Prunier - Mirabelle de Nancy"
    16516d16513
    < "CRB Prunier - Prune de Chien"
    39172d39168
    < "Sorgho - France"
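
The same duplicate check can be sketched in Python. This is only an illustration: it assumes the file is a gzipped JSON array of objects that each carry an `identifier` field (which is what the `jq '.[]|select(.identifier==...)'` filters in this discussion suggest). The function names `load_records` and `duplicate_identifiers` are hypothetical, and the sample records are a tiny in-memory stand-in, not the real file.

```python
import gzip
import json
from collections import Counter


def load_records(path):
    """Load a gzipped JSON array of records (assumed file layout)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def duplicate_identifiers(records):
    """Return the identifiers that occur more than once, with their counts."""
    counts = Counter(record["identifier"] for record in records)
    return {identifier: n for identifier, n in counts.items() if n > 1}


# In-memory sample; the last identifier is made up for illustration.
sample = [
    {"identifier": "Sorgho - France", "name": "AGEN"},
    {"identifier": "Sorgho - France", "name": "TIGNE"},
    {"identifier": "Hypothetical - unique entry", "name": "X"},
]

print(duplicate_identifiers(sample))
```

On the real data one would presumably call `duplicate_identifiers(load_records(path))` with the path to the Florilège file.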

    The first pair appears to have all the same properties, but this is not the case for all the others:

    rflores@urgi131:~/local/git/data-discovery/data/rare (data/add-brc4env-epcoll-data *$%=)$ jq '.[]|select(.identifier=="CRB Prunier - Madame Bonnard")' $file
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "CRB Prunier - Madame Bonnard",
      "name": "Madame Bonnard",
      "description": "Traditional cultivar/landrace",
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Prunus domestica",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "FRA",
      "countryOfCollect": null,
      "locationOfOrigin": null,
      "locationOfCollect": null
    }
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "CRB Prunier - Madame Bonnard",
      "name": "Madame Bonnard",
      "description": "Traditional cultivar/landrace",
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Prunus domestica",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "FRA",
      "countryOfCollect": null,
      "locationOfOrigin": null,
      "locationOfCollect": null
    }

    OK

    rflores@urgi131:~/local/git/data-discovery/data/rare (data/add-brc4env-epcoll-data *$%=) $ jq '.[]|select(.identifier=="CRB Prunier - Mirabelle de Nancy")' $file
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "CRB Prunier - Mirabelle de Nancy",
      "name": "Mirabelle de Nancy. Collect location: Lucey - Meurthe-et-Moselle (54) - France",
      "description": "Traditional cultivar/landrace",
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Prunus insititia",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "FRA",
      "countryOfCollect": null,
      "locationOfOrigin": {
        "lat": 48.7225,
        "lon": 5.8375
      },
      "locationOfCollect": null
    }
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "CRB Prunier - Mirabelle de Nancy",
      "name": "Mirabelle de Nancy. Collect location: Thélod - Meurthe-et-Moselle (54) - France",
      "description": "Traditional cultivar/landrace",
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Prunus insititia",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "FRA",
      "countryOfCollect": null,
      "locationOfOrigin": {
        "lat": 48.5467,
        "lon": 6.04528
      },
      "locationOfCollect": null
    }

    locationOfOrigin differs between the two entries (and name carries a different collect location).

    rflores@urgi131:~/local/git/data-discovery/data/rare (data/add-brc4env-epcoll-data *$%=) $ jq '.[]|select(.identifier=="CRB Prunier - Prune de Chien")' $file
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "CRB Prunier - Prune de Chien",
      "name": "Prune de Chien. Collect location: Sabres - Landes (40) - France",
      "description": "Traditional cultivar/landrace",
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Prunus domestica",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "FRA",
      "countryOfCollect": null,
      "locationOfOrigin": {
        "lat": 44.1497,
        "lon": -0.739167
      },
      "locationOfCollect": null
    }
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "CRB Prunier - Prune de Chien",
      "name": "Prune de Chien. Collect location: Sabres - Landes (40) - France",
      "description": "Traditional cultivar/landrace",
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Prunus domestica",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "FRA",
      "countryOfCollect": null,
      "locationOfOrigin": {
        "lat": 44.1497,
        "lon": -0.739167
      },
      "locationOfCollect": null
    }

    Looks OK.

    rflores@urgi131:~/local/git/data-discovery/data/rare (data/add-brc4env-epcoll-data *$%=) $ jq '.[]|select(.identifier=="Sorgho - France")' $file
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "Sorgho - France",
      "name": "AGEN",
      "description": null,
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Sorghum bicolor",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "NGA",
      "countryOfCollect": null,
      "locationOfOrigin": null,
      "locationOfCollect": null
    }
    {
      "pillarName": "Pilier Plante",
      "databaseSource": "Florilege",
      "portalURL": "http://florilege.arcad-project.org",
      "identifier": "Sorgho - France",
      "name": "TIGNE",
      "description": null,
      "dataURL": null,
      "domain": "Plantae",
      "taxon": "Sorghum bicolor",
      "family": null,
      "genus": null,
      "species": null,
      "materialType": null,
      "biotopeType": null,
      "countryOfOrigin": "SEN",
      "countryOfCollect": null,
      "locationOfOrigin": null,
      "locationOfCollect": null
    }

    countryOfOrigin differs between the two entries (NGA vs. SEN), and so does name (AGEN vs. TIGNE).
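
The manual comparison above (exact duplicates versus entries that share an identifier but differ in some fields) could be automated along these lines. This is a minimal sketch under assumptions, not the project's actual tooling: `differing_fields` is a hypothetical helper, and the sample records are abbreviated copies of two of the pairs shown in this discussion.

```python
from collections import defaultdict


def differing_fields(records, key="identifier"):
    """Map each duplicated identifier to the sorted list of fields that differ.

    An empty list means the entries are exact duplicates.
    Note: a missing field is treated the same as a null value here.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    report = {}
    for identifier, group in groups.items():
        if len(group) < 2:
            continue  # not a duplicate
        fields = sorted({field for record in group for field in record})
        report[identifier] = [
            field
            for field in fields
            if any(r.get(field) != group[0].get(field) for r in group[1:])
        ]
    return report


# Abbreviated records taken from the jq output in this discussion.
records = [
    {"identifier": "CRB Prunier - Madame Bonnard", "name": "Madame Bonnard",
     "countryOfOrigin": "FRA"},
    {"identifier": "CRB Prunier - Madame Bonnard", "name": "Madame Bonnard",
     "countryOfOrigin": "FRA"},
    {"identifier": "Sorgho - France", "name": "AGEN", "countryOfOrigin": "NGA"},
    {"identifier": "Sorgho - France", "name": "TIGNE", "countryOfOrigin": "SEN"},
]

for identifier, fields in differing_fields(records).items():
    status = "exact duplicate" if not fields else "differs in " + ", ".join(fields)
    print(f"{identifier}: {status}")
```

Exact duplicates could then be dropped safely, while the entries that differ probably need a decision on the identifier scheme from whoever owns the Florilège export.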