Agilent ICP-MS MSProfile.bin File Structure
This file format contains Agilent ICP-MS data (for example, from an Agilent 7700 single-quadrupole or an Agilent 8900 triple-quadrupole instrument). It shares its container and several index files with the HRMS format, but the MSProfile.bin payload is laid out differently and is not compressed.
ICP-MS data is distinguished from HRMS data by the presence of a
MSScan_XSpecific.bin file in the AcqData subdirectory.
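The presence check described above can be sketched in a few lines. This is only an illustration, not the code rainbow itself uses; the function name `looks_like_icpms` is hypothetical, and the sketch assumes the `.d` directory contains `AcqData` directly beneath it.

```python
from pathlib import Path


def looks_like_icpms(data_dir):
    """Return True if <data_dir>/AcqData holds an MSScan_XSpecific.bin file.

    Hypothetical helper: it only tests for the marker file and does not
    validate any of the other files in the directory.
    """
    marker = Path(data_dir) / "AcqData" / "MSScan_XSpecific.bin"
    return marker.is_file()
```

A caller could use such a check to choose between the HRMS and ICP-MS decoding paths before opening MSProfile.bin.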
The data is encoded across several files:
| File | Information |
|---|---|
| MSTS.xml | Number of retention times |
| MSTS_XSpecific.xml | Number of isotope channels (masses) |
| MSScan.xsd | File structure of MSScan.bin |
| MSScan.bin | Per-scan retention time, data offset, and point count |
| MSScan_XSpecific.bin | Presence flags the directory as ICP-MS data |
| MSProfile.bin | Intensity values (uncompressed) |
| MSTS_XAddition.xml | Real isotope m/z labels (located one level above AcqData) |
MSTS.xml and MSScan.xsd are read exactly as in the
HRMS format: the summed NumOfScans gives the number of
retention times, and the XSD defines the ScanRecordType records that
make up MSScan.bin (beginning at offset 0x58). For ICP-MS, only each scan's
retention time, SpectrumOffset, and PointCount are needed.
Because the data is not compressed, the uncompressed byte count is not used.
MSTS_XSpecific.xml lists one Masses element per isotope channel.
Counting them gives the number of channels recorded at every retention time.
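A minimal sketch of the channel count, assuming the elements are literally named `Masses` and may or may not carry an XML namespace. The function name `count_channels` is hypothetical, and the real file may contain other elements that this sketch simply ignores.

```python
import xml.etree.ElementTree as ET


def count_channels(xml_text):
    """Count the Masses elements in the text of MSTS_XSpecific.xml."""
    root = ET.fromstring(xml_text)
    # Compare local names only, so a namespace prefix (if any) is ignored.
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "Masses")
```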
MSTS_XAddition.xml (one directory above AcqData) maps each channel index to
its real isotope m/z through ProductIonMZ. These become the m/z labels.
If the file is absent, the per-channel XValue from MSScan_XSpecific.bin
is used as a fallback.
<MSTS_XAddition_IndexedMasses>
<Index>1</Index>
<PrecursorIonMZ>12</PrecursorIonMZ>
<ProductIonMZ>12</ProductIonMZ>
</MSTS_XAddition_IndexedMasses>
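Records like the one above could be collected into an index-to-m/z mapping roughly as follows. This is a sketch under stated assumptions: the helper names `_local` and `read_channel_mz` are hypothetical, the enclosing root element is not shown in this document and is not relied on, and only the `Index` and `ProductIonMZ` fields from the excerpt are read.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace prefix such as '{uri}Name', if present."""
    return tag.rsplit("}", 1)[-1]


def read_channel_mz(xml_text):
    """Map each channel Index to its ProductIonMZ as a float."""
    root = ET.fromstring(xml_text)
    labels = {}
    for record in root.iter():
        if _local(record.tag) != "MSTS_XAddition_IndexedMasses":
            continue
        # Gather the record's child elements by local name.
        fields = {_local(child.tag): (child.text or "").strip() for child in record}
        labels[int(fields["Index"])] = float(fields["ProductIonMZ"])
    return labels
```

When MSTS_XAddition.xml is missing, a caller would skip this step and fall back to the XValue entries from MSScan_XSpecific.bin, as described above.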
MSProfile.bin stores, for each retention time, four parallel blocks of
PointCount values. Assume little-endianness.
| Block | Data Type |
|---|---|
| Channel index | Float (4 bytes each) |
| Reported value (the reported intensity) | Double (8 bytes each) |
| Raw pulse count | Double (8 bytes each) |
| Analog value | Double (8 bytes each) |
rainbow keeps the reported values, which are the intensities that MassHunter writes to its CSV export. The raw pulse and analog blocks (used for secondary-electron-multiplier detector cross-calibration) are skipped.
A single data segment can be visualized as follows. The inner blocks are not
drawn to scale, and each block holds PointCount entries.
+---------+---------++----------+----------++-------+-------++--------+--------+
| index 1 | index 2 || reported | reported || pulse | pulse || analog | analog |
+---------+---------++----------+----------++-------+-------++--------+--------+
|                       repeats for every retention time                       |
+------------------------------------------------------------------------------+
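The layout above implies that one segment occupies `PointCount * (4 + 3 * 8)` bytes. The following sketch reads the channel-index and reported-value blocks from one segment using only the standard library. It assumes that SpectrumOffset is the byte offset of the segment's first channel index within MSProfile.bin and that the four blocks are contiguous in the order listed; the function names are hypothetical and this is not rainbow's actual implementation.

```python
import struct


def segment_size(point_count):
    """Bytes in one segment: a float32 block followed by three float64 blocks."""
    return point_count * (4 + 3 * 8)


def read_reported_block(buf, spectrum_offset, point_count):
    """Return (channel_indices, reported_values) for one retention time.

    buf is the full content of MSProfile.bin as bytes. The raw pulse and
    analog blocks that follow the reported values are not read.
    """
    n = point_count
    # Little-endian 4-byte floats: the channel index block.
    indices = struct.unpack_from(f"<{n}f", buf, spectrum_offset)
    # Little-endian 8-byte doubles: the reported block starts right after it.
    reported = struct.unpack_from(f"<{n}d", buf, spectrum_offset + 4 * n)
    return list(indices), list(reported)
```

Calling `read_reported_block` once per scan record, with that record's SpectrumOffset and PointCount, would yield one row of intensities per retention time.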
Note
ICP-MS parsing is reached through the MassHunter path, so it requires
hrms=True:
import rainbow as rb
datadir = rb.read("path/to/data.d", hrms=True)
Unlike the HRMS format, ICP-MS data is uncompressed and does not require
the optional python-lzf dependency.
This parser currently supports time-resolved acquisitions with a single tune mode and one measurement per isotope. Files with multiple tune modes (e.g. several collision/reaction gas settings) or multiple measurements per isotope are not yet handled.
The decoding of this format was contributed by Jeremy Hourigan (UC Santa Cruz); see issue #25.