Data import with sample log
Data Import tutorial “PortSurvey”
This is a tutorial that will demonstrate how to import fluorometer files into MATLAB and how to assemble a dataset.
This file shows example code for assembling a multi-way EEM dataset from raw (instrument) data files, including signal normalization and/or dilution correction.
You can copy and paste the code sections in this tutorial directly into your own MATLAB script. If this document is open in the MATLAB help browser (called with doc), you can right-click on code sections and evaluate them line by line (also by pressing F9 on Windows).
If you want the code with minimal comments, run:
open(which('drEEM_dataImport.m'))
The drEEM toolbox must already be installed for this command to work.
Set up the toolbox
Before we start this tutorial, we need to make sure that the drEEM toolbox is properly installed and that its functions are ready to import an example dataset. This dataset is called demofiles.zip and is stored in the drEEM toolbox folders.
If you downloaded the drEEM toolbox to your MATLAB user folder (the folder returned when you call userpath), simply call
cd(userpath) % function syntax is needed here; "cd userpath" would look for a folder named userpath
Otherwise, call:
cd 'C:\Users\expertuser\somefolder\drEEM\' % Change this!
Then, we make sure that drEEM is properly installed and that the demo files are available:
dreeminstall
unzip('demofiles.zip',fileparts(which('demofiles.zip')))
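The code in the following sections refers to a variable called demopath, which is not defined anywhere in this tutorial. A minimal sketch for setting it is shown here. It assumes that the demo folders (EEMs, BlankEEMs, and so on) were unzipped into the folder that contains demofiles.zip; this is an assumption, so adjust the path if your folders are located elsewhere.

```matlab
% Locate demofiles.zip on the MATLAB path (requires drEEM to be installed).
zipfile = which('demofiles.zip');
if isempty(zipfile)
    % Fallback for illustration only: use the current folder.
    demopath = pwd;
else
    % ASSUMPTION: the demo folders were unzipped next to demofiles.zip.
    demopath = fileparts(zipfile);
end
% Warn early if the expected subfolder is missing.
if ~exist(fullfile(demopath,'EEMs'),'dir')
    warning('No EEMs folder found in %s. Adjust demopath manually.',demopath);
end
```

If the warning appears, set demopath by hand to the folder that holds the unzipped demo data.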
Read in raw data files
In this section, we read in the csv files that contain the actual fluorescence data. First, the sample EEMs are imported:
cd([demopath '/EEMs']) % You must adjust demopath to your own computer's path!
filetype=1;ext = 'csv';RangeIn='A1..AU105';headers=[1 1];display_opt=0;outdat=2;
[X,Emmat,Exmat,filelist_eem,outdata]=readineems(filetype,ext,RangeIn,headers,display_opt,outdat);
Ex=Exmat(1,:); %Since all files have the same excitation wavelengths
Em=Emmat(:,1); %Since all files have the same emission wavelengths
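Taking the first row of Exmat and the first column of Emmat is only safe if all files really share the same wavelengths. A minimal sketch of such a check follows, using small synthetic matrices in place of the imported ones. The layout (one row per file in Exmat, one column per file in Emmat) is inferred from the indexing above and is an assumption; the _demo names are hypothetical and chosen so that the tutorial variables are not overwritten.

```matlab
% Synthetic stand-ins for Exmat and Emmat (ASSUMED layout, see text above).
nFiles = 3;
Exmat_demo = repmat(250:50:450, nFiles, 1);      % one row per file
Emmat_demo = repmat((300:100:600)', 1, nFiles);  % one column per file

% Compare every file against the first one.
sameEx = all(all(Exmat_demo == repmat(Exmat_demo(1,:), nFiles, 1)));
sameEm = all(all(Emmat_demo == repmat(Emmat_demo(:,1), 1, nFiles)));
if ~(sameEx && sameEm)
    error('Wavelengths differ between files; do not use a single Ex/Em vector.');
end

% Only now is it safe to keep a single wavelength vector for each axis.
Ex_demo = Exmat_demo(1,:);
Em_demo = Emmat_demo(:,1);
```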
Then, the corresponding blank-EEMs are imported.
cd([demopath '/BlankEEMs'])
filetype=1;ext = 'csv';RangeIn='A1..AU105';headers=[1 1];display_opt=0;outdat=2;
[X_b,Emmat_b,Exmat_b,filelist_b,outdata_b]=readineems(filetype,ext,RangeIn,headers,display_opt,outdat);
Exb=Exmat_b(1,:); %Since all files have the same excitation wavelengths
Emb=Emmat_b(:,1); %Since all files have the same emission wavelengths
Now, we import Raman scans at an excitation wavelength of 275 nm:
cd([demopath '/Raman275'])
filetype='R275';ext = 'csv';RangeIn='A2..B167';display_opt=0;outdat=2;
[S_R,W_R,wave_R,filelist_R]=readinscans(filetype,ext,RangeIn,display_opt,outdat);
RamEx=275; %note: the excitation wavelength is not 350 nm, so we will need to calculate RamOpt below
In this tutorial, we will correct inner filter effects via the absorbance-based method:
cd([demopath '/ABS1cm'])
filetype='Abs';ext = 'csv';RangeIn='A1..B521';display_opt=0;outdat=0;
[S_abs,W_abs,wave_abs,filelist_abs]=readinscans(filetype,ext,RangeIn,display_opt,outdat);
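To illustrate what an absorbance-based inner filter correction does with these scans, here is a minimal sketch of the commonly used form F_corr = F_obs * 10^(0.5*(A_ex + A_em)) for a 1 cm cuvette. All data are synthetic and all variable names are hypothetical; whether fdomcorrect applies exactly this expression is an assumption, so treat this as an illustration of the principle only.

```matlab
% Synthetic absorbance scan (1 cm path length), for illustration only.
absWave = 240:10:700;
absScan = 0.2*exp(-0.015*(absWave-240));

Ex_demo = [250 300 350];    % excitation wavelengths (nm)
Em_demo = [400; 450; 500];  % emission wavelengths (nm)

% Absorbance at each excitation and emission wavelength.
Aex = interp1(absWave, absScan, Ex_demo);  Aex = Aex(:)';  % 1 x nEx
Aem = interp1(absWave, absScan, Em_demo);  Aem = Aem(:);   % nEm x 1

% Correction factor for every Em/Ex pair.
Atot = repmat(Aem, 1, numel(Ex_demo)) + repmat(Aex, numel(Em_demo), 1);
IFC_demo = 10.^(0.5*Atot);

% Apply the correction to a (synthetic) observed EEM of size nEm x nEx.
Fobs  = ones(numel(Em_demo), numel(Ex_demo));
Fcorr = Fobs .* IFC_demo;
```

Because absorbance is never negative here, every correction factor is at least 1, and it is largest where the sample absorbs most strongly (short wavelengths).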
To correct for instrument biases, we need to import the spectral correction factors:
cd([demopath '/CorrectionFiles'])
Excor=csvread('xc06se06n.csv');
Emcor=csvread('mcorrs_4nm.csv');
Now, we import quinine sulfate emission scans at excitation wavelengths of 275 and 350 nm:
cd([demopath '/QS/Raman275'])
filetype='QSuv275';ext = 'csv';RangeIn='A2..B75';display_opt=0.5;outdat=0;
[S_qsuv275,W_qsuv275,wave_qsuv275,filelist_qsuv275]=readinscans(filetype,ext,RangeIn,display_opt,outdat);
cd([demopath '/QS/Raman350'])
filetype='QSuv350';ext = 'csv';RangeIn='A2..B75';display_opt=0.5;outdat=0;
[S_qsuv350,W_qsuv350,wave_qsuv350,filelist_qsuv350]=readinscans(filetype,ext,RangeIn,display_opt,outdat);
Align data according to records held in a Sample Log
A useful feature of drEEM is that it allows you to align fluorescence data with other sample data that you may want to relate to fluorescence results. First, you must import this metadata:
cd(demopath)
% Import sample log
filename='D:\Collaborations\PARAFAC tutorial\drEEM parts\demo\SampleLog_PortSurveyDemo.csv'; % Change this to the location of the sample log on your computer!
strings=[0 1 1 1 0 0 1 1 1 1 1 1 0 1 0 1]; %specify which columns in the log contain text
SampleLog=readlogfile(filename,strings);
Then, we must make sure that the EEMs are properly aligned with the metadata, e.g. that the 5th entry in the sample log describes the 5th EEM in the dataset.
% Align data according to the SampleLog. The EEMs are listed in SampleLog.Log_EEMfile
% first align raw data from SampleLog with filelist_eem
sites= alignds(SampleLog,{'EEMfile',filelist_eem},{'Site'}); %Site names are listed in SampleLog.Site
cruises= alignds(SampleLog,{'EEMfile',filelist_eem},{'Cruise'}); %Cruise names are listed in SampleLog.Cruise
dates= alignds(SampleLog,{'EEMfile',filelist_eem},{'AnalDate'});
Q= alignds(SampleLog,{'EEMfile',filelist_eem},{'QS_Slope'});
sampleID= alignds(SampleLog,{'EEMfile',filelist_eem},{'SampID'});
replicates= alignds(SampleLog,{'EEMfile',filelist_eem},{'Rep'});
dilfac= alignds(SampleLog,{'EEMfile',filelist_eem},{'dilutionfactor'});
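The idea behind this alignment can be sketched with plain MATLAB: look up each imported file name in the log and pull the metadata out in the order of the imported files. This is not how alignds is implemented (that is not shown here); the file names, site names, and variables are made up for illustration, and cell arrays are used for simplicity.

```matlab
% Hypothetical sample log: file names and the site of each sample.
logFiles = {'s1.csv'; 's2.csv'; 's3.csv'};
logSites = {'Harbour'; 'River'; 'Coast'};

% Hypothetical import order, which differs from the order in the log.
filelist_demo = {'s3.csv'; 's1.csv'; 's2.csv'};

% Find the log row that belongs to each imported file.
[found, idx] = ismember(filelist_demo, logFiles);
if ~all(found)
    error('Some imported files are not listed in the sample log.');
end

% Metadata reordered so that row i describes the i-th imported file.
sites_demo = logSites(idx);
```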
Next, we align previously loaded datasets with filelist_eem (ABS, Raman, Blanks, etc):
Sabs= alignds(SampleLog,{'EEMfile',filelist_eem},{'ABSfile',filelist_abs,S_abs}); %ABS data files are listed in SampleLog.ABSfile
B= alignds(SampleLog,{'EEMfile',filelist_eem},{'BlankFile',filelist_b,X_b}); %blank EEM files are listed in SampleLog.BlankFile,Blist is used later
Sr= alignds(SampleLog,{'EEMfile',filelist_eem},{'RamanFile',filelist_R,S_R});
Sqsuv275= alignds(SampleLog,{'EEMfile',filelist_eem},{'Qw',filelist_qsuv275,S_qsuv275});
Sqsuv350= alignds(SampleLog,{'EEMfile',filelist_eem},{'Qw',filelist_qsuv350,S_qsuv350});
Correct the EEMs
Here, we correct EEMs for different artefacts.
cd([demopath '/Corrected_EEMs'])
In this particular case, we must line up wavelength information with the quinine sulfate dilution series data.
% Attach Wavelength headers to scans
A=[wave_abs;Sabs]; %add wavelengths to Absorbance scans
Qw275=[wave_qsuv275;Sqsuv275]; %2D Matrix of Raman scans matched to QS dilution series
Qw350=[wave_qsuv350;Sqsuv350]; %2D Matrix of Raman scans matched to QS dilution series
Then, we eliminate data outside the wavelength range of the Ex and Em correction files:
Em_in=Em(Em<=682);
X_in=X(:,Em<=682,:);
B_in=B(:,Em<=682,:);
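The trimming relies on logical indexing along the emission dimension, with EEM arrays arranged as samples x emission x excitation (as in the indexing above). A minimal sketch with synthetic data and hypothetical _demo names shows that the wavelength vector and the data array must be trimmed with the same mask:

```matlab
% Synthetic data: 2 samples, emission 300-700 nm in 4 nm steps, 3 excitations.
Em_demo = (300:4:700)';
X_demo  = rand(2, numel(Em_demo), 3);

% One logical mask, reused for the wavelength vector and the data array.
keep = Em_demo <= 682;
Em_in_demo = Em_demo(keep);
X_in_demo  = X_demo(:, keep, :);

% The emission dimension of the array must match the trimmed vector.
fprintf('Kept %d of %d emission wavelengths.\n', numel(Em_in_demo), numel(Em_demo));
```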
Then, we use assembledataset to convert the various arrays and matrices containing bits of information into a data structure. Structures are useful in MATLAB for keeping connected variables in one container for easy reference.
DS=assembledataset(X_in,Ex,Em_in,'AU','filelist',filelist_eem,[]) %#ok<*NOPTS>
Next, we must account for the fact that fluorescence intensities depend on the lamp used to excite DOM. This step is called “Normalization” and can be done by dividing fluorescence of DOM by Raman peak intensities or quinine sulfate emission integrals.
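The principle of Raman normalization can be sketched as follows: integrate the water Raman peak over a chosen emission range and divide the fluorescence intensities by that area. The scan below is a synthetic Gaussian, the numbers are made up, and the variable names are hypothetical. The actual calculation in fdomcorrect may differ (for example in how blanks or baselines are handled), so this is an illustration only.

```matlab
% Synthetic water Raman scan (arbitrary units), for illustration only.
wave_demo = 280:1:330;
scan_demo = 50*exp(-((wave_demo-303).^2)/(2*4^2));

% Integrate the peak over an example integration range (290-320 nm).
inRange  = wave_demo >= 290 & wave_demo <= 320;
Arp_demo = trapz(wave_demo(inRange), scan_demo(inRange));

% Divide fluorescence intensities (AU) by the Raman area to obtain Raman units.
eemAU = [100 200; 300 400];
eemRU = eemAU / Arp_demo;
```

Dividing by a signal measured on the same day with the same lamp removes the dependence on lamp intensity, which is why the normalized values become comparable between days and instruments.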
Method 1 - using Raman scans at Ex = 275 nm.
First we need to locate an optimal integration range that will capture the peak at 275 nm:
cd([demopath '/Corrected_EEMs/method1'])
W=[wave_R;Sr]; %2D Matrix of matched (275 nm) Raman scans
RamMat=[wave_R;S_R]; %2D Matrix of unmatched Raman scans (corresponding to filelist_R)
[IR,IRmed,IRdiff] = ramanintegrationrange(RamMat,filelist_R,RamEx,1800,6,0,0.1);
From IRmed = [288 320], we choose an integration range of 290-320 nm, noting that no correction factors exist for Em<290 nm in our specific case.
RamOpt=[275 290 320];
W_in=W(:,wave_R>=290);
Qw_in=Qw275(:,wave_qsuv275>=290);
[XcRU1, Arp1, IFCmat1, BcRU1, XcQS1, QS_RU1]=fdomcorrect(DS,DS.Ex,DS.Em,Emcor,Excor,W_in,RamOpt,A,B_in,[],Q,Qw_in);
Method 2 - using Raman scans at Ex=350 nm
These scans at Ex = 350 nm are extracted from the MilliQ blanks. First, determine the optimal Raman integration range using the blanks in X_b, which contain no duplicate EEMs. The best range is 374-426 nm (IRmed); however, we have no QS blank measurements below 375 nm, so we choose 375 nm as the lower limit.
W350 = [Emb'; squeeze(X_b(:,:,Exb==350))]; %Extract the water Raman scans at Ex=350
[IR,IRmed,IRdiff] = ramanintegrationrange(W350,filelist_b,350,1800,4,0,0.1)
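RamOpt=[350 375 426]; %calculated Raman integration range: 374-426 nm, with the lower limit raised to 375 nm
RamOpt=[] %default Raman integration range: 381-426 nm (this line overrides the one above; remove it to use the calculated range)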
cd([demopath '/Corrected_EEMs/method2'])
W = [Emb'; squeeze(B(:,:,Exb==350))]; %Extract the water Raman scans at Ex=350
Qw=Qw350;
[XcRU2, Arp2, IFCmat2, BcRU2, XcQS2, QS_RU2]=fdomcorrect(DS,DS.Ex,DS.Em,Emcor,Excor,W,RamOpt,A,B_in,[],Q,Qw);
Next, we calculate the QS/RU conversion factors for the three QS dilution series and check that the ratios are stable:
QSonRU=mean(unique(QS_RU2));
fprintf('\n\nQS/RU conversion factors by Method 2:\n')
disp(unique(QS_RU2))
fprintf('...having a mean value of: \n')
disp(QSonRU) % (88.96-90.05 with the default Raman integration range)
Method 3 - Select the ‘Best available’ corrected datasets
For QSE:
XcQS=XcQS1; %Method 1
%For RU:
XcRU=XcQS1/QSonRU; %Method 1 divided by mean QS/RU ratio from Method 2
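The conversion between the two units is simple arithmetic, sketched here with made-up numbers and hypothetical _demo names. The sketch assumes that every sample carries the factor of the QS dilution series it was matched to, so that the vector of factors contains repeated values that unique collapses to one value per series.

```matlab
% Made-up QS/RU factors: one value per sample, three distinct QS series.
QS_RU_demo = [89.0; 89.0; 89.5; 89.5; 90.1; 90.1];

% One factor per dilution series, their mean, and their relative spread.
factors     = unique(QS_RU_demo);
QSonRU_demo = mean(factors);
spreadPct   = 100*(max(factors)-min(factors))/QSonRU_demo;
fprintf('Mean QS/RU factor: %.2f (spread: %.1f %%)\n', QSonRU_demo, spreadPct);

% Convert a (synthetic) EEM in QSE to Raman units with the mean factor.
XcQS_demo = [10 20; 30 40];
XcRU_demo = XcQS_demo / QSonRU_demo;
```

A small spread between the series indicates that a single mean factor is a reasonable choice for converting the whole dataset.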
Optional - correct for dilution if necessary
If samples were diluted before measuring EEMs and Abs scans, divide the corrected EEMs by the sorted dilution factors (dilfac NOT df).
XcQS_df1=undilute(XcQS,dilfac);
XcRU_df1=undilute(XcRU,dilfac);
fprintf('\n\n');
fprintf(' sampleID diluted undiluted\n');
f350_450=[sampleID XcQS(:,Em==450,Ex==350) XcQS_df1(:,Em==450,Ex==350)];
disp(f350_450); %compare diluted vs undiluted at 350/450
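The division described above can be sketched without the toolbox: every sample (first dimension of the array) is divided by its own dilution factor. The data are synthetic and the names are hypothetical. The convention that a factor of 0.5 means the sample made up half of the measured volume is an assumption, and the internals of undilute are not shown here.

```matlab
% Synthetic dataset: 3 samples x 2 emission x 2 excitation wavelengths.
X_dil = ones(3, 2, 2);
X_dil(2, :, :) = 4;

% One dilution factor per sample, in the same order as the samples.
% ASSUMPTION: 1 = undiluted, 0.5 = sample diluted to half its concentration.
dilfac_demo = [1; 0.5; 1];

% Divide each sample by its factor (expands along the Em and Ex dimensions).
X_undil = bsxfun(@rdivide, X_dil, dilfac_demo);
```

The order of the dilution factors must match the order of the samples, which is why the aligned vector dilfac is used in the tutorial code.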
Save the results
In the last step, we simply save the data so that we do not have to repeat the import again.
cd([demopath '/Corrected_EEMs'])
save PortSurveyDataset.mat XcQS XcRU Ex Em_in filelist_eem sites replicates cruises dates sampleID
xlswrite('OutData_FDOM.xlsx',cellstr(filelist_eem),'filelist')
cd(demopath)
cd([demopath '/Corrected_EEMs/method1'])
save('CorrData_FDOM.mat','filelist_eem','IR','IRmed','IRdiff','RamEx','-append')
xlswrite('OutData_FDOM.xlsx',cellstr(filelist_eem),'filelist')
cd([demopath '/Corrected_EEMs/method2'])
save('CorrData_FDOM.mat','filelist_eem','QSonRU','-append')
xlswrite('OutData_FDOM.xlsx',cellstr(filelist_eem),'filelist')
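After saving, it can be worth confirming that a file loads back with the expected contents. The following minimal sketch writes two synthetic variables to a temporary file (a hypothetical location chosen so that no tutorial file is touched), reloads them into a structure, and compares them with the originals.

```matlab
% Synthetic variables standing in for the saved results.
Ex_demo    = 250:5:450;
Em_in_demo = (300:4:680)';

% Save to a temporary file so that no tutorial file is overwritten.
tmpfile = [tempname '.mat'];
save(tmpfile, 'Ex_demo', 'Em_in_demo');

% Load into a structure; this avoids overwriting workspace variables.
S = load(tmpfile);
roundTripOK = isequal(S.Ex_demo, Ex_demo) && isequal(S.Em_in_demo, Em_in_demo);
if ~roundTripOK
    error('Saved data do not match the variables in the workspace.');
end
delete(tmpfile);  % clean up the temporary file
```

The same pattern, S = load('PortSurveyDataset.mat') followed by fieldnames(S), lists the variables stored in the dataset file saved above.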