MiXCR and pRESTO parameters

MiXCR parameters (non-barcoded data)

name	parameters	comments
Align	-f, -g, --noMerge, -p = kaligner2, --species = hsa, -OreadsLayout = Collinear, -OvParameters.geneFeatureToAlign = VTranscript, -OallowPartialAlignments = true
Assemble	-f, -OassemblingFeatures = FR1Begin:FR4Begin	Since sequences are cropped at the end of CDR3, the FR4 region is absent from the final sequences. We chose this value because running MiXCR with the seemingly more appropriate value FR1Begin:FR4End is unstable and often produces an empty repertoire.
Export clones	-f, --no-spaces, -sequence, -count, -readIds

pRESTO parameters (non-barcoded data)

name	parameters	comments
CollapseSeq	Default parameters	This stage can use information about primers, but we do not use it because we want the benchmarking to be primer-independent. The stage can also fix unspecified nucleotides (“N”s), but we do not use this feature either, since such nucleotides are handled at the preliminary alignment step.
SplitSeq	Default parameters	This stage uses a threshold parameter (--num=X) analogous to the one in IgReC (discussed in Section 2.2 of the main text). In our experiments, this parameter is not fixed; estimating its optimal value is part of the benchmarking.

pRESTO parameters (barcoded data)

name	parameters
ClusterSets	Default parameters
BuildConsensus	--prcons 0.6 --maxerror 0.1 --maxgap 0.5
CollapseSeq	--uf PRCONS --cf CONSCOUNT --act sum

Table A1. Benchmarking parameters of MiXCR (top) and pRESTO (middle) on non-barcoded datasets and pRESTO (bottom) on barcoded datasets. For all tools, we unified read merging, alignment, and filtering by using the IgReC preprocessing. After this preprocessing, all input libraries contain only Ig-relevant reads, cropped at the start of the corresponding V gene.
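
As an illustration of how the MiXCR rows of Table A1 might translate into command lines, the following minimal sketch assembles the argument list for each stage without executing anything. It rests on several assumptions that are not stated in the table: that the "flag = value" notation corresponds to "flag value" for ordinary options and to "-Okey=value" for -O overrides, that the stages are invoked as `mixcr align`, `mixcr assemble`, and `mixcr exportClones` with an input and an output path, and that the file names (`reads.fastq`, `sample.*`) are hypothetical placeholders.

```python
import shlex

# Stage options copied from Table A1 (MiXCR, non-barcoded data).
# Assumption: the table's "flag = value" notation maps to "flag value" for
# ordinary options and to "-Okey=value" for -O overrides.
MIXCR_STAGES = {
    "align": [
        "-f", "-g", "--noMerge",
        "-p", "kaligner2",
        "--species", "hsa",
        "-OreadsLayout=Collinear",
        "-OvParameters.geneFeatureToAlign=VTranscript",
        "-OallowPartialAlignments=true",
    ],
    "assemble": ["-f", "-OassemblingFeatures=FR1Begin:FR4Begin"],
    "exportClones": ["-f", "--no-spaces", "-sequence", "-count", "-readIds"],
}


def mixcr_command(stage, input_path, output_path):
    """Return the argv list for one MiXCR stage; nothing is executed."""
    return ["mixcr", stage, *MIXCR_STAGES[stage], input_path, output_path]


def mixcr_pipeline(reads, prefix):
    """Chain the three stages; the file names are hypothetical placeholders."""
    vdjca, clns, table = f"{prefix}.vdjca", f"{prefix}.clns", f"{prefix}.txt"
    return [
        mixcr_command("align", reads, vdjca),
        mixcr_command("assemble", vdjca, clns),
        mixcr_command("exportClones", clns, table),
    ]


if __name__ == "__main__":
    # Print the commands for inspection instead of running them.
    for argv in mixcr_pipeline("reads.fastq", "sample"):
        print(shlex.join(argv))
```

Keeping the options in a single dictionary mirrors the table layout, so a change to one stage's parameters is made in one place and the printed commands can be checked against Table A1 before anything is run.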