Background

Many laboratories focussing on Dissolved Organic Matter (DOM) are using a Horiba Aqualog for their absorbance and fluorescence measurements. Judging by the workshops we host, it is the most-used instrument in the community. Amongst the users of the instrument, quite a few use it in connection with an autosampler or by using the SampleQ feature. I myself like the SampleQ feature and use it extensively whenever it makes sense (i.e. makes my life easier). This post addresses the functionality to automate processing and export. I argue that this particular instrument-software feature should be avoided in its current form if you’re planning to do anything but a quick inspection of your measurements.

The issues

Have you ever used this feature? Don’t know? Well, if your files end in ABS.dat or PEM.dat, then you have used this feature yourself. Ok, but what are the issues?
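If you want to check a whole data folder rather than eyeball it, a few lines of code will do. The following is a minimal Python sketch, not part of drEEM or the instrument software; the function name find_sampleq_exports is made up for illustration, and the only thing it relies on is the file-name endings mentioned above.

```python
from pathlib import Path

# File-name endings named in the text; compared in lower case as a precaution.
SAMPLEQ_SUFFIXES = ("abs.dat", "pem.dat")


def find_sampleq_exports(folder):
    """Return the sorted names of files in `folder` ending in ABS.dat or PEM.dat."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.name.lower().endswith(SAMPLEQ_SUFFIXES)
    )
```

Calling find_sampleq_exports on a measurement folder returns an empty list if none of the files follow the automatic export naming.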

Lack of FAIRness

The files produced by the software’s SampleQ functionality all end in the same file patterns. And they all contain either absorbance data or fluorescence data. But no descriptions beyond that. Wavelength and data; that’s it. Good, that makes them easy to import. But imagine you retrieve those files in some weeks, months, or years. Or you inherit the files from a group member who left for another job. But without their extensive lab notebook. Can you tell from the files how they were processed (blanks, IFEs, normalization, scatter)? Spoiler: No, you can’t. Well, you can guess. But in some cases, it is hard to tell. And do you want to guess if you’re trying to reuse these data in a project?

For example, you won’t be able to tell if the blanks were good (if those got subtracted) or if your scatter excision settings were adequate (if you decided to cut scatter). We run PARAFAC workshops on a regular basis and see contaminated blanks every time. In the end you’ll wonder why your PARAFAC components look strange and settle for a model with few components because it’s the only stable model. But how frustrating is it that you never had a chance to get more components in the first place because your automatically subtracted blank was contaminated? Not to mention that your fluorescence intensities are offset due to that contaminated blank. Not great for global comparisons.

As authors of a MATLAB toolbox, we often get criticism for supporting closed-source software with expensive licensing. But the drEEM toolbox code is open and documented, and it records in great detail when and how each step was performed. The same cannot be said of the features you use in the SampleQ functionality because the files don’t show the information. And all of this functionality can be implemented using open source software. I argue that you can and should use drEEM or staRdom instead of SampleQ.

Lack of consistency in the exported files

We receive support requests from users with SampleQ-type files on a somewhat regular basis. File imports fail and the obvious culprit is the importing software toolbox, not the exporting instrument software, right?

With drEEM-2, we tried to simplify the topic of data import. We use the newer MATLAB functions (readmatrix or readtable) that are quite smart in finding the boundaries of your files. You used to have to provide this information (“A1..B189”). But when you reuse scripts, these inputs are easy to overlook, and you will import strange scans if you changed resolution or wavelength coverage.

The import functions of drEEM-2 look at the first file in sequence, identify the boundaries, and then check all subsequent files for their boundaries. If there are mismatches, files will be excluded from the import to catch errors preemptively. Cue the SampleQ functionality: During the processing, it will sometimes identify measurements as having produced nonsense data (e.g. absorbance too high). It will then decide to export empty cells. Not necessarily a problem, right? Except this often happens at the edges of your measurements, e.g. at short excitation. The import scripts then get confused by the empty cells and MATLAB has a hard time automatically detecting the file boundaries. End result: You see an error during the import and get frustrated. It’s the first step with this new unknown toolbox and an error right at the start is, well, not a strong start.
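To make the idea concrete, here is a rough Python sketch of such a pre-import check. It is not how drEEM does it (drEEM is MATLAB and relies on readmatrix or readtable); the function names are hypothetical, and the tab delimiter is an assumption you may need to change for your files. It flags files that contain empty cells and files whose dimensions differ from the first clean file.

```python
import csv


def table_shape_report(path, delimiter="\t"):
    """Return (n_rows, n_cols, n_empty_cells) for one delimited text file."""
    with open(path, newline="") as fh:
        # Skip completely blank lines; keep everything else as a row of cells.
        rows = [row for row in csv.reader(fh, delimiter=delimiter) if row]
    n_cols = max((len(r) for r in rows), default=0)
    # Count blank cells plus cells missing from rows shorter than the widest row.
    n_empty = sum(
        sum(1 for cell in r if cell.strip() == "") + (n_cols - len(r))
        for r in rows
    )
    return len(rows), n_cols, n_empty


def check_files(paths, delimiter="\t"):
    """Return {path: reason} for files with empty cells or a deviating shape.

    The reference shape is taken from the first file without empty cells.
    """
    problems = {}
    reference = None
    for path in paths:
        n_rows, n_cols, n_empty = table_shape_report(path, delimiter)
        if n_empty:
            problems[path] = f"{n_empty} empty cell(s)"
            continue
        if reference is None:
            reference = (n_rows, n_cols)
        elif (n_rows, n_cols) != reference:
            problems[path] = f"shape {(n_rows, n_cols)} differs from {reference}"
    return problems
```

Note that this sketch counts every blank cell, so a header row with a legitimately blank corner cell would be flagged as well; adapt the check if your exports look like that.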

The solution

Don’t use the automatic export function to produce processed *ABS.dat or *PEM.dat files. Instead, only produce the *.ogw files. These can be opened in the software or even Origin Pro if you don’t have the Aqualog software on hand. You can export these measurements using the more reliable “Export HJY data” function. If you read in these files, you can use the default options of importeems and importabsorbance, and I have yet to spot errors other than the occasional mismatch in the wavelength coverage that comes from using different settings.

Using this HJY export function, your blanks are always paired with your sample, and the processing is completely left to you, the expert. You’ll produce better processed EEMs and CDOM spectra and the drEEM toolbox will automatically document everything for you. We’re working hard on improving features that increase FAIRness and you’ll future-proof your own analysis pipeline by going this route.