Citrine: Template CSV (.csv)

Ingester Description
The Template CSV ingester can be used to upload files that have been created to comply with Citrine's .csv template format. This template makes use of recognized keywords in the header to direct the ingestion of data.
Example file
To see an example file please click here.
File creation instructions
The .csv template has been designed to store information about materials, how they were made, and their properties. This can be done by using recognized keywords in the column headers, and entering data into those columns.

The first row should be header names, and each subsequent row should represent a different material. That material's information (name, formula, properties, etc.) is entered underneath the appropriate column header in that row.
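
The layout described above (one header row, then one row per material) can be sketched with Python's standard csv module. This is only an illustration: the function name build_template_csv and the sample values are assumptions made for this example and are not part of the template itself.

```python
import csv
import io

# Header row: one recognized keyword (plus optional name and units) per column.
header = ["NAME", "FORMULA", "PROPERTY: Density (g/cm^3)"]

# Each subsequent row describes one material (illustrative values only).
rows = [
    ["Sodium chloride", "NaCl", "2.16"],
    ["Aluminium oxide", "Al2O3", ""],  # blank cell: no density entered for this row
]


def build_template_csv(header, rows):
    """Return CSV text with the header first and one material per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_template_csv(header, rows)
print(csv_text)
```

To produce a file for upload, the same text could be written to a path ending in .csv instead of an in-memory buffer.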

Note on Microsoft Excel files: Please convert .xls or .xlsx files to .csv format before uploading them to Citrination.
Keywords
  • NAME
  • FORMULA
  • IDENTIFIER
  • CLASSIFICATION
  • UID
  • IDEAL COMPOSITION
  • ACTUAL COMPOSITION
  • REFERENCE
  • PREPARATION STEP NAME
  • PREPARATION STEP DETAIL
  • PROPERTY
  • CONDITION
  • ALL CONDITION
  • FIGURE NUMBER
  • FIGURE CAPTION
  • TABLE NUMBER
  • TABLE CAPTION
  • METHOD
  • DATA TYPE
  • IDEAL QUANTITY
  • ACTUAL QUANTITY
  • FILE
  • SUBSYSTEM
Header format
General header format:
KEYWORD: Name (Units)
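
As a rough illustration of how a header in this format splits into its parts, the sketch below uses a regular expression. It is not the ingester's actual parser: the Header type, the pattern and the parse_header function are hypothetical names introduced here, and the sketch assumes that the name and the units are both optional.

```python
import re
from typing import NamedTuple, Optional


class Header(NamedTuple):
    keyword: str
    name: Optional[str]
    units: Optional[str]


# Hypothetical pattern for "KEYWORD: Name (Units)"; name and units are optional.
HEADER_PATTERN = re.compile(
    r"""^\s*
        (?P<keyword>[^:(]+?)          # keyword: text before any colon or parenthesis
        \s*
        (?::\s*(?P<name>[^()]+?))?    # optional name after a colon
        \s*
        (?:\((?P<units>[^()]*)\))?    # optional units inside parentheses
        \s*$""",
    re.VERBOSE,
)


def parse_header(cell: str) -> Header:
    """Split one header cell into keyword, name and units."""
    match = HEADER_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"Unrecognized header format: {cell!r}")
    return Header(match["keyword"], match["name"], match["units"])


print(parse_header("PROPERTY: Density (g/cm^3)"))
print(parse_header("NAME"))
print(parse_header("IDEAL QUANTITY (mass)"))
```

A header with an unbalanced parenthesis, such as "PROPERTY: Density (g/cm^3", raises ValueError in this sketch.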
Important information
Except for FORMULA, you can add as many or as few columns as you need with each keyword. Only one formula column should appear in any spreadsheet.

At least one NAME, FORMULA or IDENTIFIER should be present for each row.

Any cell can be left blank if it is not appropriate to enter data for that row.

After any keyword except NAME, FORMULA and PREPARATION STEP NAME, you can specify the name of the property, condition, step detail, etc.

Names should not contain parentheses or colons except to separate keywords from column names and to denote units.

Only .CSV files can be ingested with this converter. Please save all .XLS and .XLSX files as .CSV files, using the "Save as" option and then upload these .CSV files to Citrination.
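
Two of the rules above (a single FORMULA column, and at least one NAME, FORMULA or IDENTIFIER value in every row) can be checked before uploading. The following is a minimal sketch and not part of Citrination: check_template and keyword_of are hypothetical helpers, and the sketch assumes keywords are written exactly as in the keyword list. Columns with a SUBSYSTEM prefix are not counted.

```python
import csv
import io


def keyword_of(header_cell):
    """Return the part of a header cell before any ':' or '(' (trimmed)."""
    return header_cell.split(":", 1)[0].split("(", 1)[0].strip()


def check_template(csv_text):
    """Return a list of human-readable problems found in the CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    keywords = [keyword_of(cell) for cell in rows[0]]
    problems = []

    # Rule: only one FORMULA column per file.
    if keywords.count("FORMULA") > 1:
        problems.append("more than one FORMULA column")

    # Rule: every row needs a NAME, FORMULA or IDENTIFIER value.
    id_columns = [
        index
        for index, keyword in enumerate(keywords)
        if keyword in {"NAME", "FORMULA", "IDENTIFIER"}
    ]
    for line_number, row in enumerate(rows[1:], start=2):
        if not any(i < len(row) and row[i].strip() for i in id_columns):
            problems.append(
                f"row {line_number}: no NAME, FORMULA or IDENTIFIER value"
            )
    return problems


sample = (
    "NAME,FORMULA,PROPERTY: Density (g/cm^3)\n"
    "Sodium chloride,NaCl,2.16\n"
    ",,2.73\n"
)
print(check_template(sample))
```

Row numbers in the messages count the header as row 1, so they match what a spreadsheet program displays.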
Details
NAME
The name column should contain the name of the material referenced in that row.

Example:
NAME
Sodium chloride

FORMULA
The formula column should contain the chemical formula of the material referenced in that row.

*There can be only one FORMULA column per file.

Example:
FORMULA
NaCl

IDENTIFIER

The identifier column should contain the identifier of the material referenced in that row. Identifiers can be named by entering the name to use after a colon in the header row. If a name is not provided this will default to ‘ID’. The identifier for the material referenced in each row can then be provided in that column.

Example:
IDENTIFIER: Sample Number
1234

CLASSIFICATION
The classification column should contain the classification of the material referenced in that row. Classifications can be named by entering the name to use after a colon in the header row. The classification for the material referenced in each row can then be provided in that column.

Example:
CLASSIFICATION: Component type
Amine

UID

When used, the UID column should contain a unique identifier for the sample referenced in that row. Currently, each UID can only occur once in a .csv. If the same UID is given again then that record will replace the previous one. UIDs must be unique within a dataset across all files. All non-alphanumeric characters will be stripped out of the UID field.

*There can be only one UID column per file.

Example:
UID
Sample10101a
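
The UID behaviour described above can be approximated as follows. This is a sketch under stated assumptions: it treats "alphanumeric" as ASCII letters and digits, and it assumes duplicates are compared after stripping. Neither detail is confirmed by this page, and normalize_uid and deduplicate are hypothetical names.

```python
import re


def normalize_uid(raw):
    """Remove every character that is not an ASCII letter or digit (assumption)."""
    return re.sub(r"[^A-Za-z0-9]", "", raw)


def deduplicate(records):
    """Keep the last record seen for each normalized UID.

    records is an iterable of (uid, data) pairs in file order.
    """
    latest = {}
    for uid, data in records:
        latest[normalize_uid(uid)] = data  # a repeated UID replaces the earlier record
    return latest


print(normalize_uid("Sample-10101_a"))
print(deduplicate([("Sample-1", "first"), ("Sample1", "second")]))
```

Under these assumptions, UIDs that differ only in punctuation collide, so it may be safest to use plain alphanumeric UIDs from the start.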

IDEAL COMPOSITION

The ideal composition columns should contain information about the ideal or nominal composition of the material. Each column should contain information on one element, and the element can be specified by entering the element symbol after a colon in the header row. The unit for that column should then be specified in parentheses as atomic or weight percent (at% or wt%).

Example:
IDEAL COMPOSITION: Al (wt%)
90

ACTUAL COMPOSITION

The actual composition columns should contain information about the measured or actual composition of the material. Each column should contain information on one element, and the element can be specified by entering the element symbol after a colon in the header row. The unit for that column should then be specified in parentheses as atomic or weight percent (at% or wt%).

Example:
ACTUAL COMPOSITION: Al (wt%)
90.51

REFERENCE

The reference column should contain information about the source of the data in that row. References can be named by entering the name to use after a colon in the header row. Some examples of supported names are doi, title, author, publisher, url, isbn, journal, volume, issue, year. It is not necessary to specify a name for the reference column if this is not appropriate.

Example:
REFERENCE: doi
10.101010/test01

PREPARATION STEP NAME

Preparation step name columns should contain information about the names of the steps that were used to create a material. The order of preparation columns will be preserved and appear in Citrination in the order in which they have been entered (left to right).

Example:
PREPARATION STEP NAME
Annealing

PREPARATION STEP DETAIL

Preparation step detail columns should contain additional information about the preparation step that is mentioned in the PREPARATION STEP NAME column to the left. Preparation step detail columns can only be used when a preparation step name has already been specified. Preparation step details can be named by entering the name to use after a colon in the header row. Units can be specified in parentheses. For preparation step details, names are required and units are optional.

Example:
PREPARATION STEP DETAIL: Annealing temperature (K)
700
If you need to store multiple values in the cell, these should be enclosed in square brackets and separated by commas.

Example:
PREPARATION STEP DETAIL: Annealing temperature (K)
[500, 600, 700]
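
Because a multi-value cell contains commas, it has to be quoted inside the .csv file so that it stays a single cell. Spreadsheet programs and Python's csv module do this automatically. The sketch below writes such a cell and reads it back. The parse_cell helper is hypothetical and keeps the values as strings, since how the ingester converts them is not described here.

```python
import csv
import io


def parse_cell(cell):
    """Return the values in a cell as a list of strings.

    "[500, 600, 700]" gives three values, "700" gives one, "" gives none.
    """
    text = cell.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    return [part.strip() for part in text.split(",") if part.strip()]


# Write a one-column file; the writer quotes the cell because it contains commas.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["PREPARATION STEP DETAIL: Annealing temperature (K)"])
writer.writerow(["[500, 600, 700]"])
csv_text = buffer.getvalue()
print(csv_text)

# Read it back: the quoted cell is still one field.
rows = list(csv.reader(io.StringIO(csv_text)))
print(parse_cell(rows[1][0]))
```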

PROPERTY

Property columns should contain information about properties that have been measured, calculated etc. Properties can be named by entering the name to use after a colon in the header row. Units can be specified in parentheses. For properties, names are required and units are optional.

Example:
PROPERTY: Density (g/cm^3)
2.73
If you need to store multiple values in the cell, these should be enclosed in square brackets and separated by commas.

Example:
PROPERTY: Density (g/cm^3)
[2.73, 2.50, 2.49]

CONDITION

Condition columns should contain information about conditions associated with a PROPERTY given in a column to the left. It is possible to have one property with multiple conditions (e.g. pressure, temperature). All condition columns will be associated with the nearest property column to their left. Conditions should be named by entering the name to use after a colon in the header row. Units can be specified in parentheses. For conditions, names are required and units are optional.

Example:
CONDITION: Pressure (kPa)
101
If you need to store multiple values in the cell, these should be enclosed in square brackets and separated by commas.

Example:
CONDITION: Pressure (kPa)
[101, 202, 303]
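
The "nearest property column to the left" rule can be pictured with a short sketch that walks the header row from left to right. This illustrates the rule only and is not the ingester's code: attach_to_properties is a hypothetical helper, and the example headers are made up for this illustration.

```python
def attach_to_properties(headers):
    """Map each PROPERTY column index to the CONDITION column indices attached to it."""
    groups = {}
    current_property = None
    for index, cell in enumerate(headers):
        keyword = cell.split(":", 1)[0].strip()
        if keyword == "PROPERTY":
            # A new property starts; later conditions belong to it.
            current_property = index
            groups[index] = []
        elif keyword == "CONDITION" and current_property is not None:
            groups[current_property].append(index)
    return groups


headers = [
    "NAME",
    "PROPERTY: Density (g/cm^3)",
    "CONDITION: Pressure (kPa)",
    "CONDITION: Temperature (K)",
    "PROPERTY: Hardness (HB)",
    "CONDITION: Temperature (K)",
]
print(attach_to_properties(headers))
```

In this example the density property picks up both the pressure and the first temperature column, while the second temperature column belongs to hardness.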

ALL CONDITION

All condition columns should contain information about conditions associated with all of the PROPERTY columns in the file. Conditions should be named by entering the name to use after a colon in the header row. Units can be specified in parentheses. For conditions, names are required and units are optional.

Example:
ALL CONDITION: Pressure (kPa)
101
If you need to store multiple values in the cell, these should be enclosed in square brackets and separated by commas.

Example:
ALL CONDITION: Pressure (kPa)
[101, 202, 303]

METHOD
Method columns should contain information about the method used to obtain the PROPERTY information given in the column to the left. It is possible to have one property with multiple methods. All method columns will be associated with the nearest property column to their left. Methods do not need a name in the header row and no units should be specified for methods.

Example:
METHOD
DFT

DATA TYPE

Data type columns should contain information about the data type of the PROPERTY given in the column to the left. Each property can have only one data type. All data type columns will be associated with the nearest property column to their left. Data types do not need a name in the header row and no units should be specified for data types. Options for data type are: (1) Experimental, (2) Computational or (3) Machine Learning.

Example:
DATA TYPE
Experimental

FIGURE/TABLE NUMBER

FIGURE/TABLE NUMBER columns should contain the original source figure/table number for the PROPERTY given in the column to the left. Each property can have only one figure/table number. All figure/table number columns will be associated with the nearest property column to their left. Figure/table number columns do not need a name in the header row and no units should be specified for figure/table numbers. Numbers can be numeric, alphanumeric or Roman numerals.

Example:
FIGURE NUMBER
1

FIGURE/TABLE CAPTION

FIGURE/TABLE CAPTION columns should contain the original source figure/table caption for the PROPERTY given in the column to the left. Each property can have only one figure/table caption. All figure/table caption columns will be associated with the nearest property column to their left. Figure/table caption columns do not need a name in the header row and no units should be specified for figure/table captions.

Example:
FIGURE CAPTION
Stress strain plot for material A.

IDEAL QUANTITY

Ideal quantity columns should contain information about the quantity or amount of the system. This is most commonly used with subsystems, where different quantities of each subsystem make up the main system. The supported units are mass percentage (mass), number percentage (number) and volume percentage (volume).

Example:
IDEAL QUANTITY (mass)
80

ACTUAL QUANTITY

Actual quantity columns should contain information about the quantity or amount of the system. This is most commonly used with subsystems, where different quantities of each subsystem make up the main system. The supported units are mass percentage (mass), number percentage (number) and volume percentage (volume).

Example:
ACTUAL QUANTITY (number)
75

FILE

The file columns should contain the full filename and extension for files (images or other file types) that you want to link to that record. The file itself must be uploaded into the same dataset as the CSV template file using the Default converter. The additional files cannot be uploaded at the same time as the CSV template file as they are not to be processed with the template converter. They can be added before or after the template file is uploaded.

Example:
FILE: XRD Data,FILE: SEM Image
xrd_file1.raw,sem1.png

SUBSYSTEM

Any keyword can be used with the prefix SUBSYSTEM [subsystem ID], e.g. SUBSYSTEM A, SUBSYSTEM 1, etc. The information in that column is then assigned to a subsystem rather than the main system. Every unique subsystem ID becomes a new subsystem of the main system in that row of the CSV file.
Example:
SUBSYSTEM A NAME,SUBSYSTEM A PROPERTY: Hardness (HB),SUBSYSTEM B IDENTIFIER,SUBSYSTEM B PROPERTY: Hardness (HB)
Alpha,100,0000001,120
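
To make the prefix rule concrete, the sketch below splits a header into its subsystem ID and the remaining keyword header, then groups the columns by subsystem. It assumes the subsystem ID is a single token without spaces (as in SUBSYSTEM A or SUBSYSTEM 1). split_subsystem is a hypothetical helper, not part of the ingester.

```python
import re

# Assumption: "SUBSYSTEM", then a one-token ID, then the usual keyword header.
SUBSYSTEM_PATTERN = re.compile(r"^SUBSYSTEM\s+(?P<subsystem_id>\S+)\s+(?P<rest>.+)$")


def split_subsystem(header_cell):
    """Return (subsystem_id, remaining_header); subsystem_id is None for the main system."""
    text = header_cell.strip()
    match = SUBSYSTEM_PATTERN.match(text)
    if match is None:
        return None, text
    return match["subsystem_id"], match["rest"]


headers = [
    "SUBSYSTEM A NAME",
    "SUBSYSTEM A PROPERTY: Hardness (HB)",
    "SUBSYSTEM B IDENTIFIER",
    "SUBSYSTEM B PROPERTY: Hardness (HB)",
]

# Group the remaining headers under each unique subsystem ID.
by_subsystem = {}
for cell in headers:
    subsystem_id, rest = split_subsystem(cell)
    by_subsystem.setdefault(subsystem_id, []).append(rest)

print(by_subsystem)
```

Each key in by_subsystem corresponds to one subsystem of the main system in a row. Headers without the prefix would be grouped under None, meaning the main system.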









