Description
During reprocessing of the SWA stream TM request for 2021/04/30, we hit a bug that generates a core dump and breaks our data processing chain.
This kind of bug has already been encountered and described at:
It seems related to incomplete or missing TM packets in the stream data coming from EDDS.
When reprocessing 2021/04/30 from batch request TM files, the bug no longer appears, because the TM data packet sequence is complete.
Identification
We were able to isolate a very short time interval by filtering the data:
- from 2021/04/30 01:30 to 01:40,
- selecting only PAS 3D compressed snapshot TM data.
This corresponds to a sequence of TM packets with SID in [202, 203, 204].
The result is a sequence of 18 packets from the stream TM data that reproduces the bug.
For comparison, we also extracted the same TM data from a batch request, which yields a more complete file (38 packets).
We verified that the same PAS software fails with the 18 stream TM packets and generates a core dump, whereas it works with the 38 batch TM packets.
This let us clearly identify the TM packet that triggers the core dump in the CCSDS 121 uncompression.
The bug occurs when uncompressing packet #11 of 18 (identified as #10 in the logfile, since numbering there starts from 0).
We were later able to reproduce the bug with a shorter sequence of 4 TM data packets (from #7 to #11 in the previous file).
Available documents
Description of these documents
- Stream TM data hexadecimal input file that can be used to reproduce the bug (18 TM packets)
- Corresponding CCSDS packet description
- Batch TM data hexadecimal input file (38 packets) that works
- Corresponding CCSDS packet description
- Python script used to extract both stream and batch TM packets
- Shell script used to verify that stream TM data processing fails
- Error logfile generated by the CCSDS compressor
- solo_L1_swa-pas-log_20210430_V01.log: PAS software logfile, which stops before the core dump
- Shell script used to verify that batch TM data processing succeeds
How to test
I think we can extract the payload of TM packet #11 and feed it to the CCSDS uncompressor to reproduce the bug.
Then check whether it generates a core dump, or what the length of the generated uncompressed data is.
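One way to run such a test without relying on the core dump itself is sketched below. It allocates the output buffer with extra guard bytes after the expected size, calls the decoder, and counts how many guard bytes were overwritten. The decoder_fn type mirrors the uncompress_data() prototype quoted in this report; run_guarded, GUARD_BYTE and the two stub decoders are hypothetical names used only for illustration, and the sketch only catches overflows that stay within the slack area.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the uncompress_data() prototype quoted in this report. */
typedef int (*decoder_fn)(unsigned char *buffer, int size, int lci,
                          unsigned char *u_buffer, int u_size);

#define GUARD_BYTE 0xA5  /* hypothetical guard pattern */

/*
 * Runs the decoder on an output buffer of expected_size bytes followed by
 * `slack` guard bytes. Returns the number of guard bytes that were modified
 * (0 means no overflow was detected), or -1 on allocation failure.
 * A guard byte overwritten with GUARD_BYTE itself goes unnoticed.
 */
long run_guarded(decoder_fn decode, unsigned char *in, int in_size, int lci,
                 int expected_size, int slack)
{
    assert(decode != NULL && expected_size >= 0 && slack >= 0);

    size_t total = (size_t)expected_size + (size_t)slack;
    unsigned char *out = malloc(total > 0 ? total : 1);
    if (out == NULL)
        return -1;
    memset(out, GUARD_BYTE, total);

    (void)decode(in, in_size, lci, out, expected_size);

    long damaged = 0;
    for (size_t i = (size_t)expected_size; i < total; i++)
        if (out[i] != GUARD_BYTE)
            damaged++;

    free(out);
    return damaged;
}

/* Stand-in decoder, for demonstration only: writes exactly u_size bytes. */
int exact_stub(unsigned char *buffer, int size, int lci,
               unsigned char *u_buffer, int u_size)
{
    (void)buffer; (void)size; (void)lci;
    memset(u_buffer, 0, (size_t)u_size);
    return 0;
}

/* Stand-in decoder, for demonstration only: deliberately writes 8 bytes
 * past the end of the output buffer, as a faulty decoder might. */
int overflowing_stub(unsigned char *buffer, int size, int lci,
                     unsigned char *u_buffer, int u_size)
{
    (void)buffer; (void)size; (void)lci;
    memset(u_buffer, 0, (size_t)u_size + 8);
    return 0;
}
```

In a real test, the stub would be replaced by the library's uncompress_data() and the input by the payload of the failing packet; the slack would have to be chosen large enough to absorb the overflow actually produced.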
Necessary improvement
To uncompress PAS 3D data, we call a shared CCSDS121 library, which provides the following function:
int uncompress_data(
    unsigned char *buffer,    // compressed input buffer
    int size,                 // compressed input buffer size
    int lci,                  // compression parameter
    unsigned char *u_buffer,  // uncompressed output buffer
    int u_size);              // uncompressed output buffer size
PAS 3D compressed data starts with a packet with SID = 202.
The SID = 202 packet gives only the "number of samples", currently 300.
Then, for each sample, we receive a packet with SID = 203, which describes the characteristics of the 3D sampling.
This packet lets us compute the dimensions of the original uncompressed data, which we store in a variable called expected_size.
We can then dynamically allocate a u_buffer to receive the full data after uncompression.
We then receive a packet (or a sequence of packets) with SID = 204, which contains the compressed data.
We concatenate the payloads of these packets into a compressed input buffer and compute its size.
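The concatenation step can be sketched as follows. This is a minimal illustration, not the PAS software's actual code: payload_buffer, payload_append and payload_free are hypothetical names, and the sketch assumes each SID = 204 payload is already available as a byte array with a known length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer holding the concatenated SID 204 payloads. */
typedef struct {
    unsigned char *data;
    size_t size;      /* bytes used */
    size_t capacity;  /* bytes allocated */
} payload_buffer;

/* Appends one packet payload; returns 0 on success, -1 on allocation failure. */
int payload_append(payload_buffer *pb, const unsigned char *payload, size_t len)
{
    assert(pb != NULL && (payload != NULL || len == 0));

    if (pb->size + len > pb->capacity) {
        size_t new_cap = pb->capacity ? pb->capacity : 1024;
        while (new_cap < pb->size + len)
            new_cap *= 2;
        unsigned char *p = realloc(pb->data, new_cap);
        if (p == NULL)
            return -1;  /* pb->data is unchanged and still owned by the caller */
        pb->data = p;
        pb->capacity = new_cap;
    }
    if (len > 0)
        memcpy(pb->data + pb->size, payload, len);
    pb->size += len;
    return 0;
}

/* Releases the buffer and resets it to the empty state. */
void payload_free(payload_buffer *pb)
{
    assert(pb != NULL);
    free(pb->data);
    pb->data = NULL;
    pb->size = 0;
    pb->capacity = 0;
}
```

Starting from a zero-initialised payload_buffer, each SID = 204 payload is appended in arrival order; data and size then become the compressed input buffer and its size. Since uncompress_data() takes an int size, the caller would also need to check that size fits in an int before the call.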
We then call uncompress_data():
- with 3 input parameters: buffer, size, lci,
- and 2 output parameters: u_buffer, u_size.
The output parameter u_size gives the size of the uncompressed data stored in u_buffer.
We can see a problem here
We do not pass the expected size as an input parameter, so the uncompress_data() function has no way to know when to stop. If, for any reason, it receives erroneous input data, it can overflow the output buffer.
How to fix
I think we must add an input parameter to uncompress_data() that gives the expected_size.
Inside the function, we should add at least a test like:
while (uncompression_process_loop) {
    ...
    if (u_size > expected_size) {
        printf("Output uncompressed buffer exceeds %d bytes\n", expected_size);
        return ERROR;
    }
}
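To illustrate the idea on something self-contained, here is a sketch using a toy run-length decoder as a stand-in. It is not the CCSDS 121 algorithm, and bounded_rle_decode and the error codes are hypothetical names. The point is only the bound check: the decoder receives expected_size and refuses to write before it would exceed it.

```c
#include <assert.h>
#include <stddef.h>

#define DECODE_OVERFLOW  (-1)  /* hypothetical error code: output would exceed expected_size */
#define DECODE_TRUNCATED (-2)  /* hypothetical error code: input ends mid-pair */

/*
 * Toy run-length decoder (NOT CCSDS 121): the input is a sequence of
 * (count, value) byte pairs. The check before each run guarantees that
 * the output never exceeds expected_size, whatever the input contains.
 * Returns the number of bytes written, or a negative error code.
 */
int bounded_rle_decode(const unsigned char *in, int in_size,
                       unsigned char *out, int expected_size)
{
    assert(in != NULL && out != NULL && in_size >= 0 && expected_size >= 0);

    if (in_size % 2 != 0)
        return DECODE_TRUNCATED;

    int written = 0;
    for (int i = 0; i < in_size; i += 2) {
        int count = in[i];
        if (count > expected_size - written)
            return DECODE_OVERFLOW;  /* refuse before writing out of bounds */
        for (int k = 0; k < count; k++)
            out[written++] = in[i + 1];
    }
    return written;
}
```

Note that the check is placed before the write. A test performed only after the data has been stored detects the overflow once memory beyond the buffer has already been modified, so checking the remaining room first seems the safer variant.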
2021/05/11 Analysis improvement
We have made progress in the analysis of this bug.
We can now isolate a short sequence of 4 CCSDS TM packets:
APID( 96,12) TM( 21, 3) # 9215 Size = 6 2021-04-30T01:36:10.828 202 58981 SWA_TM_SCI_PAS_TRIGG_SNAP_COMPR_START
APID( 96,12) TM( 21, 3) # 9216 Size = 20 2021-04-30T01:36:11.211 203 58990 SWA_TM_SCI_PAS_TRIGG_SNAP_COMPR_FIRST
APID( 96,12) TM( 21, 6) # 9220 Size = 1404 2021-04-30T01:36:12.213 204 58991 SWA_TM_SCI_PAS_TRIGG_SNAP_COMPR_DATA
APID( 96,12) TM( 21, 6) # 9225 Size = 1400 2021-04-30T01:36:13.177 204 58991 SWA_TM_SCI_PAS_TRIGG_SNAP_COMPR_DATA
The first one (SID = 202) contains the number of 3D samples included in this PAS 3D snapshot period ⇒ Number of samples = 9
The second one (SID = 203) gives the description of the first 3D sample, with the number of K, energies, elevation, CEM, which allows us to compute the size of the original uncompressed data = 17820 bytes
The third and fourth packets (SID = 204) contain the compressed data in their payload ⇒
See related documents:
- documents/2021-04-30/solo_L0_swa-pas-tm_20210430_V00.hex: the CCSDS hexadecimal file (4 CCSDS packets)
- documents/2021-04-30/solo_L0_swa-pas-tm_20210430_V00.txt: description of these 4 packets
- documents/2021-04-30/solo_L1_swa-pas-log_20210430_V01.log: L1 data processing log corresponding to these 4 packets