Documents
-
Metadata checker logfiles:
-
Solar Orbiter Observing plans (SOOPS)
Main issue for SWA-PAS datasets
The main issue for SWA-PAS CDF datasets concerns CDF variable names.
Extracted from the Solar Orbiter metadata dictionary V.2.5
3.2.1.1 General conventions
The general conventions for the CDF variables for Solar Orbiter are provided in the following list:
* CDF variable description and naming conventions shall be compliant with the ISTP
guidelines. In addition, CDF variable names shall contain capital letters only and shall not exceed 63 characters.
This is a good rule, but unfortunately we did not follow it when creating our first CDF files.
Our variable names are currently a mix of uppercase, lowercase and capitalized forms.
As an example, for the solo_L1_swa-pas-3d dataset:
Variables: [
'Epoch',
'Duration', 'CCSDS_time', 'SCET',
'SOURCE', 'SAMPLE', 'NB_SAMPLE',
'FIRST_ENERGY', 'NB_ENERGY',
'FIRST_ELEVATION', 'NB_ELEVATION',
'FIRST_CEM', 'NB_CEM',
'INFO', 'SCHEME', 'FULL_3D', 'COMPRESSED',
'MAX_CNT_ENERGY', 'MAX_CNT_ELEVATION', 'MAX_CNT_CEM',
'NB_K', 'K',
'COUNTS',
'Energy', 'delta_p_Energy', 'delta_m_Energy',
'Azimuth', 'Elevation',
'delta_Azimuth', 'delta_Elevation']
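The naming rule quoted above (capital letters only, at most 63 characters) can be checked mechanically before delivery. The following is a minimal sketch, not the official metadata checker: the function name find_noncompliant_names is hypothetical, and "capital letters only" is interpreted here as "no lowercase letters", so digits and underscores are accepted, which is an assumption.

```python
MAX_NAME_LENGTH = 63  # limit quoted from the metadata dictionary


def find_noncompliant_names(names):
    """Return the names that contain lowercase letters or exceed 63 characters."""
    return [
        name for name in names
        if name != name.upper() or len(name) > MAX_NAME_LENGTH
    ]


# A few variable names taken from the solo_L1_swa-pas-3d list above
sample = ["Epoch", "Duration", "SCET", "NB_ENERGY", "COUNTS", "delta_p_Energy"]
print(find_noncompliant_names(sample))
```

With pycdf, the list of names could be obtained by iterating over the open CDF object, which behaves like a dictionary keyed by variable name.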
Implications
We can easily rename our variables to uppercase.
This implies a full reprocessing and redelivery of all PAS L1/L2 CDFs for the whole mission, but that is the case for any update of the CDF metadata anyway.
We will also have to update all the software that uses these CDF files:
-
produce_L1
C software that creates PAS L1 CDFs from L0 telemetry files
-
produce_L2
Python software that creates PAS L2 CDFs from L1 ones
-
produce_L3
IDL + Python software that creates PAS L3 CDFs from PAS L2 swa-pas-vdf ones
-
tools
Various python tools to manipulate PAS CDFs (plots, checks, statistics…)
-
cl software
IDL software written by Emmanuel Penou to plot our Solar Orbiter datasets and many other experiments
-
AMDA software
Multi-mission analysis tools written by IRAP to plot our datasets
-
SOAR
Probably some implications in the SOAR SQL databases that describe SWA-PAS datasets
Most of these modifications are easy to make (replacing each CDF variable name by its uppercase form), but the various software packages will probably not work with a mix of older and newer datasets.
⇒ all PAS CDF files and all the software will have to be switched from the old names to the new ones at the same time
⇒ not so easy to do
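One way to soften the switch, sketched here under assumptions, is to let the reading software resolve variable names without regard to case during a transition period, so that the same code accepts both older and newer files. The helper name resolve_name is hypothetical and nothing like it exists in the current software as far as this note says.

```python
def resolve_name(available_names, wanted):
    """Return the name in available_names that matches `wanted`, ignoring case."""
    matches = [name for name in available_names if name.upper() == wanted.upper()]
    if len(matches) != 1:
        raise KeyError(f"{wanted!r}: expected exactly one match, found {matches}")
    return matches[0]


old_file = ["Epoch", "COUNTS", "Energy"]   # names as in the current files
new_file = ["EPOCH", "COUNTS", "ENERGY"]   # names after the uppercase switch
print(resolve_name(old_file, "EPOCH"), resolve_name(new_file, "EPOCH"))
```

This only helps the software we control (produce_L2, produce_L3, tools); it does not address cl, AMDA or the SOAR databases.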
Metadata checker
It would be useful to install a copy of the CDF metadata checker on our computers, to make sure that all the rules are respected before delivering a new CDF file to MSSL/SOAR.
This would avoid unnecessary round trips between IRAP and the SOAR archive.
Is it possible to have a copy of the metadata checker? Is it a Python tool?
Do you think this tool will continue to evolve?
If so, it may later detect new discrepancies that would require a new full delivery of our CDFs.
The volumes involved are 450 GB for PAS L1 data and 880 GB for PAS L2.
Global attributes
Missing global attributes
-
Software_version : missing for L1 datasets
-
TEXT : missing for L1 datasets
-
TARGET_NAME
-
TARGET_CLASS
-
TARGET_REGION
-
TIME_MIN : should be extracted : Epoch[0]
-
TIME_MAX : should be extracted : Epoch[-1]
-
SOOP_NAME
-
SOOP_TYPE
-
OBS_ID
-
LEVEL : missing
-
Instrument : missing
-
Data_product : missing
-
Acknowledgement : empty
-
Parents : missing for L1 datasets, OK for L2
There is no technical issue in updating these attributes.
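As a minimal sketch of the TIME_MIN / TIME_MAX extraction mentioned in the list above, the function below takes the first and last Epoch records. The function name is hypothetical, and the ISO 8601 string format is an assumption: the format actually required for these attributes has to be taken from the metadata dictionary.

```python
from datetime import datetime


def time_range_attributes(epochs):
    """Return TIME_MIN / TIME_MAX strings from a time-ordered sequence of datetimes."""
    if len(epochs) == 0:
        raise ValueError("empty Epoch variable")
    return {
        "TIME_MIN": epochs[0].isoformat(timespec="seconds"),   # Epoch[0]
        "TIME_MAX": epochs[-1].isoformat(timespec="seconds"),  # Epoch[-1]
    }


# Made-up timestamps standing in for the Epoch records of one daily file
epochs = [datetime(2023, 11, 11, 0, 0, 4), datetime(2023, 11, 11, 23, 59, 56)]
print(time_range_attributes(epochs))
```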
Global attributes to update
Some messages from the metadata checker are a bit strange:
Source_name is ['SOLO>Solar Orbiter']
From dict, it should be: 'SOLO>Solar Orbiter'
Instrument is ['SWA-PAS>Solar Wind Analyser-Proton Alpha Sensor']
From dict, it should be: 'SWA-PAS>Solar Wind Analyser Proton Alpha Sensor'
I do not understand these messages. Is it OK or not?
Some attributes are to be updated:
-
Descriptor
Descriptor is ['SWA-PAS>Solar Wind Analyser / Proton-Alpha Sensor']
From dict, it should be: 'SWA-PAS-3D>Solar Wind Analyser, Proton Alpha Sensor, etc'
Some attributes are logically dependent:
-
CDF filename = "solo_L2_swa-pas-vdf_20231127_V01.cdf"
-
Logical_file_id = "solo_L2_swa-pas-vdf_20231127_V01"
-
Logical_source = "solo_L2_swa-pas-vdf"
-
Instrument = "SWA-PAS>Solar Wind Analyser, Proton-Alpha Sensor"
-
Data_type = "L2> Level 2 Data"
-
LEVEL = "L2>Level 2 Data"
-
Data_product = "VDF>Velocity Distibution function"
-
Descriptor = "SWA-PAS-VDF>SWA PAS Velocity distribution function"
This was not so clear when reading the metadata standard document.
We have to check/update all these global attributes for every L1/L2/L3 dataset.
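Since several of the attributes listed above can be derived from the file name, they could be generated rather than typed by hand. The sketch below is an illustration under assumptions: the function name and the returned keys level_prefix and descriptor_prefix are hypothetical, and the pattern only covers the daily file name form shown above.

```python
import re

FILENAME_PATTERN = re.compile(
    r"^(?P<source>solo)_(?P<level>L\d)_(?P<descriptor>[a-z0-9-]+)"
    r"_(?P<date>\d{8})_V(?P<version>\d{2})\.cdf$"
)


def attributes_from_filename(filename):
    """Derive the filename-dependent global attributes from a CDF file name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    source, level, descriptor = match["source"], match["level"], match["descriptor"]
    return {
        "Logical_file_id": filename[: -len(".cdf")],
        "Logical_source": f"{source}_{level}_{descriptor}",
        "level_prefix": level,                    # text before ">" in LEVEL and Data_type
        "descriptor_prefix": descriptor.upper(),  # text before ">" in Descriptor
    }


print(attributes_from_filename("solo_L2_swa-pas-vdf_20231127_V01.cdf"))
```

The text after ">" in LEVEL, Data_product and Descriptor cannot be derived this way and would still come from a per-dataset table.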
CDF variables
Missing CDF variables
-
QUALITY_FLAG
Currently we have no QUALITY_FLAG variable in our CDFs, only a quality_factor variable in our L2 datasets.
We plan to add QUALITY_FLAG to the L2 files (computed from quality_factor).
What should we do with the PAS L1 files? Add QUALITY_FLAG with a default value? Try to define a real quality flag?
-
QUALITY_BITMASK
We do not currently use QUALITY_BITMASK.
Do we have to add QUALITY_BITMASK with a default value of 0, and use some of its bits later?
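To make the "default 0 now, use some bits later" option concrete, here is a small sketch of how such a bitmask behaves. The bit names and positions are purely hypothetical: no bit assignment has been defined for PAS, and the real meaning of each bit would have to be agreed and documented.

```python
# Hypothetical bit assignments, for illustration only
BIT_SATURATION = 1 << 0
BIT_DATA_GAP = 1 << 1


def default_bitmask(n_records):
    """Return one bitmask per record, with no bit set."""
    return [0] * n_records


def has_bit(value, bit):
    """Tell whether `bit` is set in `value`."""
    return bool(value & bit)


bitmask = default_bitmask(3)
bitmask[1] |= BIT_DATA_GAP  # a bit can be raised later without touching the others
print(bitmask, has_bit(bitmask[1], BIT_DATA_GAP), has_bit(bitmask[1], BIT_SATURATION))
```

A value of 0 then simply means "nothing to report", so files delivered with the default stay valid once some bits are put to use.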
Variable attributes
We have a lot of attributes that will have to be updated or added.
-
SI_CONVERSION
-
VAR_NOTES
-
FORMAT
-
DISPLAY_TYPE
We have to fill in these attributes…
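Before filling in these attributes, it may help to list where they are missing. The sketch below works on a plain dictionary of variable attributes; the function name is hypothetical, and treating all four attributes as required for every variable is a simplification, since the metadata standard may require some of them only for certain variable types.

```python
REQUIRED_ATTRIBUTES = ("SI_CONVERSION", "VAR_NOTES", "FORMAT", "DISPLAY_TYPE")


def missing_attributes(variables):
    """Map each variable name to the listed attributes that are absent or blank."""
    report = {}
    for name, attrs in variables.items():
        missing = [
            key for key in REQUIRED_ATTRIBUTES
            if not str(attrs.get(key, "")).strip()
        ]
        if missing:
            report[name] = missing
    return report


# Placeholder attribute values, not taken from real PAS files
variables = {
    "EPOCH": {"VAR_NOTES": "placeholder", "FORMAT": " "},
    "COUNTS": {key: "placeholder" for key in REQUIRED_ATTRIBUTES},
}
print(missing_attributes(variables))
```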
Technical considerations
With CDF files, any modification of the metadata implies a full reprocessing of all the CDF files for the whole mission.
We have to create a new CDF file with an incremented version number, and deliver it to MSSL, then to SOAR.
Reprocessing/patching
Internally, we can easily modify the metadata of a CDF file WITHOUT creating a new CDF file.
from datetime import datetime

from spacepy import pycdf

filename = "solo_L1_swa-pas-mom_20231111_V01.cdf"

with pycdf.CDF(filename, readonly=False) as cdf:
    # add/modify a CDF global attribute
    cdf.attrs["NEW_ATTRIBUTE"] = "add some text here"
    cdf.attrs["Generation_date"] = datetime.now().isoformat(timespec="seconds")

    # add/modify a variable attribute
    cdf["Epoch"].attrs["VAR_NOTES"] = "add another text here"

    # add a new variable (1-byte unsigned integer type, an assumption
    # to check against the metadata standard)
    cdf.new("QUALITY_FLAG", recVary=True, type=pycdf.const.CDF_UINT1)
    cdf["QUALITY_FLAG"] = [0] * len(cdf["Epoch"])

    # rename a variable (uppercase)
    cdf["Epoch"].rename("EPOCH")

# Modifications are written to the original file
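One point to watch with the rename step in the pycdf example above: ISTP attributes such as DEPEND_0 or DELTA_PLUS_VAR hold the names of other variables, so they have to be updated together with the rename. The sketch below shows the idea on plain dictionaries. The function name is hypothetical, the list of pointer attributes may be incomplete, and the sample attribute values are guesses rather than the content of real PAS files.

```python
POINTER_ATTRIBUTES = (
    "DEPEND_0", "DEPEND_1", "DEPEND_2", "DEPEND_3",
    "LABL_PTR_1", "LABL_PTR_2", "LABL_PTR_3",
    "DELTA_PLUS_VAR", "DELTA_MINUS_VAR",
)


def uppercase_variables(variables):
    """Return a copy with uppercase variable names and updated pointer attributes."""
    renamed = {}
    for name, attrs in variables.items():
        new_attrs = dict(attrs)
        for key in POINTER_ATTRIBUTES:
            target = new_attrs.get(key)
            # only rewrite values that really point to an existing variable
            if isinstance(target, str) and target in variables:
                new_attrs[key] = target.upper()
        renamed[name.upper()] = new_attrs
    return renamed


variables = {
    "Epoch": {},
    "Energy": {"DELTA_PLUS_VAR": "delta_p_Energy", "UNITS": "eV"},
    "delta_p_Energy": {},
    "COUNTS": {"DEPEND_0": "Epoch"},
}
print(uppercase_variables(variables))
```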
Do you think it would be possible to apply this kind of software patch to the CDF files held in SOAR, for a given dataset, to make minor updates? It could avoid redelivering one or more datasets for the whole mission.
Otherwise, we have to make a copy of each CDF file, update the copy and deliver the new file to MSSL ⇒ SOAR.
$ cp solo_L1_swa-pas-mom_20231111_V01.cdf solo_L1_swa-pas-mom_20231111_V02.cdf
$ update_metadata solo_L1_swa-pas-mom_20231111_V02.cdf
$ PUT_MSSL solo_L1_swa-pas-mom_20231111_V02.cdf
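When the copy step above is applied to a whole dataset, the new file name can be computed instead of typed. This is a minimal sketch with a hypothetical function name; it assumes the two-digit _Vnn.cdf suffix seen in the file names of this note.

```python
import re


def next_version_filename(filename):
    """Return the file name with its two-digit version number incremented by one."""
    match = re.fullmatch(r"(?P<stem>.+_V)(?P<version>\d{2})\.cdf", filename)
    if match is None:
        raise ValueError(f"no _Vnn.cdf suffix in {filename!r}")
    version = int(match["version"]) + 1
    if version > 99:
        raise ValueError("version number does not fit in two digits")
    return f"{match['stem']}{version:02d}.cdf"


print(next_version_filename("solo_L1_swa-pas-mom_20231111_V01.cdf"))
```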