Simple data import

To open this page as a MATLAB live script, run the command

open(which("import_tutorial_01.mlx"))

This tutorial shows how to import fluorescence and absorbance data organized as file triplets (sample EEM, blank EEM, absorbance spectrum). The particular dataset imported here was measured on a Horiba AquaLog.

Table of Contents

Toolbox setup

Data import

Dataset alignment

Toolbox setup

clearvars
close all force
cd(fileparts(matlab.desktop.editor.getActiveFilename))
tbx=drEEMtoolbox;

This clears all old variables (should they exist), closes all figures (if they exist), and changes the directory to the directory of this tutorial. Lastly, the toolbox is initialized as an object whose methods (functions) are accessible via the notation tbx.functionname.

Data import

Data import is relatively simple once you’ve found the right settings. In drEEM, importeems handles fluorescence data and importabsorbance handles absorbance scans.

cd demofiles_AL_HYI
samples=tbx.importeems('* - Waterfall Plot Sample.dat');
Checking wavelength integrity of data set
Dimension check for files passed.
1/33: CAMAS01 (01) (0.51 sec. remaining)
2/33: LCEP02 (01) (0.6 sec. remaining)
3/33: LCEP05 (01) (0.47 sec. remaining)
4/33: LCEP08 (01) (0.43 sec. remaining)
5/33: LCEP11 (01) (0.42 sec. remaining)
6/33: LCEP14 (01) (0.36 sec. remaining)
7/33: LCEP20 (01) (0.34 sec. remaining)
8/33: LCEP23 (01) (0.34 sec. remaining)
9/33: LCEP26 (01) (0.34 sec. remaining)
10/33: QS (01) (0.37 sec. remaining)
11/33: TS01 (01) (0.29 sec. remaining)
12/33: TS02 (01) (0.27 sec. remaining)
13/33: TS03 (01) (0.27 sec. remaining)
14/33: TS04 (01) (0.27 sec. remaining)
15/33: TS05 (01) (0.26 sec. remaining)
16/33: TS06 (01) (0.24 sec. remaining)
17/33: TS07 (01) (0.25 sec. remaining)
18/33: TS08 (01) (0.21 sec. remaining)
19/33: TS09 (01) (0.19 sec. remaining)
20/33: TS10 (01) (0.19 sec. remaining)
21/33: TS11 (01) (0.18 sec. remaining)
22/33: TS12 (01) (0.17 sec. remaining)
23/33: TS13 (01) (0.15 sec. remaining)
24/33: TS14 (01) (0.13 sec. remaining)
25/33: TS15 (01) (0.11 sec. remaining)
26/33: TS16 (01) (0.1 sec. remaining)
27/33: TS17 (01) (0.09 sec. remaining)
28/33: TS18 (01) (0.08 sec. remaining)
29/33: TS19 (01) (0.06 sec. remaining)
30/33: TS20 (01) (0.04 sec. remaining)
31/33: TS21 (01) (0.03 sec. remaining)
32/33: TS22 (01) (0.02 sec. remaining)
33/33: TS23 (01) (0 sec. remaining)
blanks=tbx.importeems('* - Waterfall Plot Blank.dat');
Checking wavelength integrity of data set
Dimension check for files passed.
1/33: CAMAS01 (01) (0.47 sec. remaining)
2/33: LCEP02 (01) (0.49 sec. remaining)
3/33: LCEP05 (01) (0.47 sec. remaining)
4/33: LCEP08 (01) (0.4 sec. remaining)
5/33: LCEP11 (01) (0.4 sec. remaining)
6/33: LCEP14 (01) (0.38 sec. remaining)
7/33: LCEP20 (01) (0.34 sec. remaining)
8/33: LCEP23 (01) (0.35 sec. remaining)
9/33: LCEP26 (01) (0.34 sec. remaining)
10/33: QS (01) (0.34 sec. remaining)
11/33: TS01 (01) (0.36 sec. remaining)
12/33: TS02 (01) (0.3 sec. remaining)
13/33: TS03 (01) (0.29 sec. remaining)
14/33: TS04 (01) (0.29 sec. remaining)
15/33: TS05 (01) (0.27 sec. remaining)
16/33: TS06 (01) (0.27 sec. remaining)
17/33: TS07 (01) (0.25 sec. remaining)
18/33: TS08 (01) (0.24 sec. remaining)
19/33: TS09 (01) (0.22 sec. remaining)
20/33: TS10 (01) (0.24 sec. remaining)
21/33: TS11 (01) (0.21 sec. remaining)
22/33: TS12 (01) (0.18 sec. remaining)
23/33: TS13 (01) (0.19 sec. remaining)
24/33: TS14 (01) (0.15 sec. remaining)
25/33: TS15 (01) (0.25 sec. remaining)
26/33: TS16 (01) (0.11 sec. remaining)
27/33: TS17 (01) (0.1 sec. remaining)
28/33: TS18 (01) (0.07 sec. remaining)
29/33: TS19 (01) (0.06 sec. remaining)
30/33: TS20 (01) (0.04 sec. remaining)
31/33: TS21 (01) (0.03 sec. remaining)
32/33: TS22 (01) (0.01 sec. remaining)
33/33: TS23 (01) (0 sec. remaining)
absorbance=tbx.importabsorbance('* - Abs Spectra Graphs.dat');
Checking wavelength integrity of data set
Dimension check for files passed.
1/32: CAMAS01 (01) (0.21 sec. remaining)
2/32: LCEP02 (01) (0.21 sec. remaining)
3/32: LCEP05 (01) (0.2 sec. remaining)
4/32: LCEP08 (01) (0.19 sec. remaining)
5/32: LCEP11 (01) (0.17 sec. remaining)
6/32: LCEP14 (01) (0.16 sec. remaining)
7/32: LCEP20 (01) (0.13 sec. remaining)
8/32: LCEP23 (01) (0.14 sec. remaining)
9/32: LCEP26 (01) (0.13 sec. remaining)
10/32: TS01 (01) (0.11 sec. remaining)
11/32: TS02 (01) (0.1 sec. remaining)
12/32: TS03 (01) (0.09 sec. remaining)
13/32: TS04 (01) (0.11 sec. remaining)
14/32: TS05 (01) (0.11 sec. remaining)
15/32: TS06 (01) (0.1 sec. remaining)
16/32: TS07 (01) (0.09 sec. remaining)
17/32: TS08 (01) (0.1 sec. remaining)
18/32: TS09 (01) (0.12 sec. remaining)
19/32: TS10 (01) (0.1 sec. remaining)
20/32: TS11 (01) (0.08 sec. remaining)
21/32: TS12 (01) (0.08 sec. remaining)
22/32: TS13 (01) (0.07 sec. remaining)
23/32: TS14 (01) (0.05 sec. remaining)
24/32: TS15 (01) (0.05 sec. remaining)
25/32: TS16 (01) (0.04 sec. remaining)
26/32: TS17 (01) (0.04 sec. remaining)
27/32: TS18 (01) (0.03 sec. remaining)
28/32: TS19 (01) (0.02 sec. remaining)
29/32: TS20 (01) (0.02 sec. remaining)
30/32: TS21 (01) (0.01 sec. remaining)
31/32: TS22 (01) (0 sec. remaining)
32/32: TS23 (01) (0 sec. remaining)
cd ..

Here, we change the directory to the folder in which the tutorial files are stored. Then we import the EEMs using the relevant file patterns: first the samples, then the blanks. Note that these patterns are deleted from the filenames in the dataset since they are repetitive and carry no information of their own. Leading and trailing whitespace is also removed automatically. Lastly, the absorbance data is imported.
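The filename clean-up described above can be imitated outside the toolbox. The following is a minimal sketch, not drEEM's actual implementation. The two filenames are assumptions reconstructed from the import pattern and the console output, and the variable names are illustrative only.

```matlab
% Two example filenames (assumed, reconstructed from the pattern and console output)
files = {'CAMAS01 (01) - Waterfall Plot Sample.dat'; ...
         'QS (01) - Waterfall Plot Sample.dat'};
pattern = '* - Waterfall Plot Sample.dat';

% The fixed part of the pattern is everything except the wildcard
fixedPart = strrep(pattern, '*', '');

% Remove the fixed part, then trim leading and trailing whitespace
names = strtrim(strrep(files, fixedPart, ''));
disp(names)
```

What remains is the part of each filename that identifies the sample, which is what appears in the console output during import.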

This tutorial is referred to as the “simple” data import tutorial since the default settings of the two import functions fit the file type perfectly. For files with a different format, you will have to look at other tutorials and/or explore the function documentation for importeems and importabsorbance. Customization mainly concerns the EEM orientation and where the required information is stored in the file(s). The number of columns and rows to import, on the other hand, is determined automatically.

As you can see, there’s some console output to keep us updated. After completion, a figure pops up to ask us about the status of the imported samples.

Why do both functions ask about the dataset? Because FAIR data starts with telling the software what the status of processing is so it can keep track of things. Read more about the status property in the documentation of drEEMstatus.

In this case, the default options for the status are appropriate. Note that in your case, the dataset status might differ and you should use this opportunity to provide the correct information. We repeat this step for the blank EEMs and absorbance spectra.

Next, we delete some information from the filenames to demonstrate how this could be done. The string ' (01)' refers to the first replicate measurement of the sample. In this example, this information is irrelevant and can be deleted. If a sample was remeasured for some reason, a string ' (02)' remains in the filename to let us know.

samples.filelist=erase(samples.filelist,{' (01)'});
blanks.filelist=erase(blanks.filelist,{' (01)'});
absorbance.filelist=erase(absorbance.filelist,{' (01)'});
samples=tbx.addcomment(samples,"Erased the ' (01)' string pattern from sample names.","newline");

This is a manual dataset modification in the broadest sense. In the spirit of reproducibility and traceability, we leave a comment in the dataset to remind us of what was done.
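Before erasing the replicate tag, it can be worth checking whether any sample carries a counter other than ' (01)'. The following is a minimal sketch on a hypothetical file list (the name 'TS02 (02)' does not occur in the tutorial data, where every sample is a first replicate). It uses only base MATLAB functions and is independent of the toolbox.

```matlab
% Hypothetical file list with one remeasured sample
filelist = {'TS01 (01)'; 'TS02 (01)'; 'TS02 (02)'};

% Names ending in any replicate counter, and names ending in ' (01)'
hasCounter = ~cellfun(@isempty, regexp(filelist, ' \(\d+\)$', 'once'));
isFirst    = ~cellfun(@isempty, regexp(filelist, ' \(01\)$', 'once'));

% A remeasured sample has a counter that is not ' (01)'
isRemeasured = hasCounter & ~isFirst;

% Strip only the first-replicate tag; later replicates keep their tag
cleaned = regexprep(filelist, ' \(01\)$', '');
disp(cleaned(isRemeasured))
```

Anchoring the expression to the end of the name avoids touching a ' (01)' that happens to occur elsewhere in a sample name.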

Dataset alignment

It is entirely possible that one of the three files is missing for some samples. Before we continue, we need to make sure that only complete triplets remain in the three datasets; incomplete triplets are deleted. The function alignsamples takes care of this by comparing the contents of filelist in each drEEMdataset supplied. In the case of this “simple” tutorial, the filenames are all identical after the import, so this step should not be difficult. Still, since alignment can delete samples, we can first run the operation in diagnostic mode to see what would happen to our datasets if we assigned an output:

tbx.alignsamples(samples,blanks,absorbance);
Diagnostic mode, no output will be assigned (no variable was specified)
samples: removed 1 samples with the previous .i: 10
blanks: removed 1 samples with the previous .i: 10
absorbance: No action necessary. All files between supplied datasets had same names.

The console output lets us know that one sample would be deleted from the sample and blank EEM datasets. What’s missing? The absorbance scan. The diagnostic mode also creates a figure that shows us exactly that. We could go looking for the absorbance scan now, perhaps by re-exporting the data. Here, however, we simply delete the incomplete sample.
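The idea behind the alignment can be sketched with plain set operations on the name lists. This is an illustration of the principle only and makes no claim about how alignsamples is implemented. The lists are abbreviated from the import console output, where QS appears among the samples and blanks but not among the absorbance scans.

```matlab
% Abbreviated name lists (subset of the names shown in the import output)
sampleNames = {'CAMAS01'; 'LCEP26'; 'QS'; 'TS01'};
blankNames  = {'CAMAS01'; 'LCEP26'; 'QS'; 'TS01'};
absNames    = {'CAMAS01'; 'LCEP26'; 'TS01'};

% Names present in all three lists form the complete triplets
common = intersect(intersect(sampleNames, blankNames), absNames);

% Sample names outside that set lack at least one partner file
[incomplete, idx] = setdiff(sampleNames, common);

fprintf('Incomplete triplet: %s (position %d in the sample list)\n', ...
    incomplete{1}, idx);
```

In this shortened list QS sits at position 3; in the full dataset it is sample 10, which matches the index reported by alignsamples.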

[samples,blanks,absorbance]=tbx.alignsamples(samples,blanks,absorbance);
samples: removed 1 samples with the previous .i: 10
blanks: removed 1 samples with the previous .i: 10
absorbance: No action necessary. All files between supplied datasets had same names.

Now, you can see that one sample was removed in the sample and blank EEM datasets. At this stage, it is safe to transfer the absorbance data into the sample dataset to cut down on the workspace variables:

samples.absWave=absorbance.absWave;
samples.abs=absorbance.abs;
tbx.validatedataset(samples);
clearvars absorbance
samples=tbx.addcomment(samples,"transferred absorbance to the sample EEM dataset","newline");

Using addcomment, we again leave a comment in the dataset’s history to remind ourselves of what we have done. The data also comes with an Excel metadata file. Let’s bring that table in with associatemetadata.

samples=tbx.associatemetadata(samples,"metadata.xlsx","sampleId")
                     drEEMdataset    imported metadata table
                     ____________    _______________________

    No. samples           32                   32           
    Total matched         31                   31           
    % matched             97                   97           
samples = 
  drEEMdataset with properties:

            history: [5x1 drEEMhistory]
                  X: [32x250x73 double]
                abs: [32x73 double]
        suppSpectra: []
           filelist: {32x1 cell}
                  i: [32x1 double]
                 Ex: [73x1 double]
                 Em: [250x1 double]
                nEx: 73
                nEm: 250
            absWave: [73x1 double]
    suppSpectraAxis: [0x1 double]
            nSample: 32
             models: [0x1 drEEMmodel]
           metadata: [32x15 table]
    opticalMetadata: [0x0 table]
              split: [0x0 drEEMdataset]
             status: [1x1 drEEMstatus]
           userdata: []
     instrumentInfo: [0x0 struct]
    measurementInfo: [0x0 struct]
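The summary printed by associatemetadata reports how many dataset entries found a partner in the metadata table. Key-based matching of this kind can be sketched with ismember. The example below is a hedged illustration with hypothetical values; it is not the toolbox's implementation, and the contents of the key column are invented for the example.

```matlab
% Names in the dataset and a hypothetical key column from a metadata table
filelist = {'CAMAS01'; 'LCEP02'; 'TS01'; 'TS02'};
sampleId = {'TS02'; 'TS01'; 'LCEP02'; 'XYZ'};

% isMatched flags dataset entries found in the key column;
% row holds the position of the match in the key column (0 if none)
[isMatched, row] = ismember(filelist, sampleId);

nMatched   = sum(isMatched);
pctMatched = round(100 * nMatched / numel(filelist));
fprintf('%d of %d samples matched (%d %%)\n', ...
    nMatched, numel(filelist), pctMatched);

% Dataset entries without metadata deserve a closer look
disp(filelist(~isMatched))
```

Listing the unmatched names is usually the quickest way to spot a spelling difference between the filenames and the key column.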

Now we can take a look at the dataset’s history with displayhistory to demonstrate how this could be useful:

tbx.displayhistory(samples)
    #            Function            User comment                      Function message                  
    _    ________________________    ____________    ____________________________________________________

    1    importeems                       ""         created dataset                                     
    2    addcomment: user comment         ""         Erased the ' (01)' string pattern from sample names.
    3    alignsamples                     ""         removed 1 samples with the previous .i: 10          
    4    addcomment: user comment         ""         transferred absorbance to the sample EEM dataset    
    5    associatemetadata                ""         Metadata associated (look at data.metadata)         

You could also use the GUI tool viewhistory for this:

tbx.viewhistory(samples)

The color coding in the GUI tool is designed to distinguish between different functions. For example, if you called subdataset many times, these lines will all have the same color. Now we can save the dataset for later use.

save('tutorial_01_imported.mat',"blanks","samples")
disp('Tutorial finished.')
Tutorial finished.

This is the end of the tutorial. If you want to go further, have a look at the data processing tutorial.