Tax4Fun - Import data length error
I am trying to run the R package Tax4Fun to predict functional capabilities from 16S data. I have an OTU table produced by a metabarcoding pipeline (VSEARCH, CUTADAPT and SWARM). My OTU table looks like this:
OTU total cloud amplicon length abundance chimera spread quality sequence identity taxonomy references PM-18S6ext_TATGCG_p16S PM-18SA_TTTTTC_p16S PM-18SB_GTCGTG_p16S PM-18SD_GCTATC_p16S PM-18SLAZ_TTTGTA_p16S PM-18SMIS_AGCCTG_p16S PM-18SQN_CAACAG_p16S PM-finaMIS_CTCTCG_p16S PM-finaTfA_TGCTGA_p16S PM-final6ext_GGTAGC_p16S PM-finalA_AATATG_p16S PM-finalB_TGAGCA_p16S PM-finalD_AATCAC_p16S PM-finalLAZ_GCAAAT_p16S PM-finalQN_AAATTG_p16S PM-finalT041p_CTGTGC_p16S PM-finalT0MIS_ACATAT_p16S PM-finalTfC_GAGCTT_p16S PM-finalTfD_TCTCGG_p16S PM-finalTfE_AGCGAC_p16S PM-finalTfF_GCCAAG_p16S PM-finalTfG_AGGTTC_p16S
1 2007 1129 f5579f91a5ca1c9ce3fe1ffe992c3dda4c718025 392 385 N 8 0.000153316326531 ctccgtgccagcagccgcggtaatacgggggatgcaagcgttatccggaatcattgggcgtaaagcgcctgtaggttgtttaataagtctgttgttaaagactagggcttaaccctagaaaagcaatggaaactactagactagagtatggcaggggtagagggaatttctagtgtagcggtgaaatgcgtagatattagaaagaacaccggtggcgaaagcgctctactggaccattactgacactcagaggcgaaagctagggtagcaaaagggattagatacccctgtagtcctagccgtaaacgatggatactacatgttgtgcattatgtacagtatggtagctaacgcgttaagtatcccgcctggggagtacgctcgcaagggtg 100.0 Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|* JQ197833.1.1304,JQ197512.1.1304,JQ196309.1.1304,JQ197835.1.1304,JQ199895.1.1304,JQ199480.1.1304,HM127811.1.1405,JX016414.1.1441,JX016490.1.1441,JX016561.1.1441,JX016673.1.1441,JX016514.1.1441,JX017009.1.1441,JX017095.1.1441,GQ346755.1.1323,GQ348584.1.1327,JX537806.1.1308,JX537897.1.1301,KC001577.1.1258,KC002381.1.1287,EF574706.1.1441,GU235484.1.1316 0 0 0 0 0 0 0 18 0 27 1 25 0 0 0 1925
2 1883 1190 f98930d400ae7efe86292c4c417c3cb61a3ce816 399 317 N 11 0.000154887218045 ctccgtgccagcagccgcggtaagacggaggatgcaagtgttattcggaatgattgggcgtaaagagtctgtaggccgtatagaaagtcttttgttaaatgcctcggctcaaccgagatccagcaaaggaaacttctatacttgagggaagtagaggtacagggaattcccggtggagcggtgaaatgcgtagatatcgggaggaacaccaatatggcgaaggcactgtactgggcttttcctgacgctgagagacgaaagctaaaggagtgattaggattagataccctagtaattttagccgtaaacgatggaaactcactgccgagcgaaatacaacgagcggtggtcaagctaacgcgtgaagtttcccgcctggggattacgcttgcaaaagtg 100.0 Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|* JQ195238.1.1313,JQ195420.1.1313,JQ195905.1.1315,JQ195346.1.1313,JQ196571.1.1313,JQ196735.1.1313,JQ195734.1.1313,JQ197124.1.1313,JQ197306.1.1313,JQ197298.1.1316,JQ196440.1.1313,JQ197433.1.1313,JQ197434.1.1313,JQ197626.1.1313,JQ196721.1.1313,JQ197965.1.1313,JQ198009.1.1313,JQ198126.1.1313,JQ198209.1.1313,JQ198219.1.1313,JQ198172.1.1313,JQ197508.1.1313,JQ197620.1.1313,JQ197637.1.1313,JQ197478.1.1313,JQ198623.1.1313,JQ196081.1.1313,JQ197596.1.1313,JQ197876.1.1313,JQ197823.1.1313,JQ198855.1.1313,JQ197700.1.1313,JQ196285.1.1313,JQ197670.1.1313,JQ197791.1.1313,JQ198169.1.1313,JQ196667.1.1313,JQ198105.1.1313,JQ198314.1.1313,JQ198542.1.1313,JQ199324.1.1313,JQ198561.1.1313,JQ198536.1.1313,JQ198564.1.1313,JQ199623.1.1313,JQ199625.1.1314,JQ199695.1.1313,JQ197497.1.1313,JQ199779.1.1313,JQ199794.1.1313,JQ199926.1.1313,JQ199323.1.1313,JQ199927.1.1313,JQ199350.1.1313,JQ198208.1.1313,JQ198233.1.1313,JQ200223.1.1313,JQ198385.1.1313,JQ199746.1.1313,JQ199948.1.1313,JQ200203.1.1313,JQ200229.1.1313,JQ200084.1.1313,JQ200011.1.1313,JQ200105.1.1313,JQ200016.1.1313,JQ199460.1.1313,JQ199585.1.1314,FN563097.1.1453,AM747382.1.1283,JX016606.1.1451,JX017016.1.1451,JX017124.1.1451,JX537822.1.1341,JX537878.1.1320,KC545747.1.1423,KF596584.1.1451 0 0 0 409 0 726 193 0 1 220 68 59 76 0 0 121 7
3 1594 916 cb9c37c9c68b234a7f1150c813737a5123e77dc4 392 322 N 5 0.000154591836735 ctccgtgccagcagccgcggtaatacgggggatgcaagcgttatccggaatcattgggcgtaaagcgcctgtaggttgtttaataagtctgttgttaaagactagggctcaaccctagaaaagcaatggaaactactagactagagtatggcaggggtagagggaatttctagtgtagcggtgaaatgcgtagatattagaaagaacaccggtggcgaaagcgctctactggaccattactgacactcagaggcgaaagctagggtagcaaaagggattagatacccctgtagtcctagccgtaaacgatggatactaggtgttgtatttattttacagtatcgtagctaacgcgttaagtatcccgcctggggagtacgctcgcaagggtg 100.0 Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|uncultured_marine_eukaryote KX937536.1.1443,KX937534.1.1443,KX937535.1.1443 0 0 0 0 0 0 0 15 1 0 1449 89 40
4 976 592 18cf185a9728b2e4e0bf8479578313d312c4702b 397 182 N 6 0.000161712846348 ctccgtgccagcagccgcggtaagacggaggatgcaagtgttatccggaatcactgggcgtaaagcgtctgtaggtggtttaataagtcaactgttaaatcttgaggctcaacctcaaaatcgcagtcgaaactgttagactagagtatagtaggggtaaagggaatttccagtggagcggtgaaatgcgtagagattggaaggaacaccgatggcgaaggcactttactgggctattactaacactcagagacgaaagctagggtagcaaatgggattagataccccagtagtcctagctgtaaacaatggatactagatgttgaacagatcgacctgtgcagtatcaaagctaacgcgttaagtatcccgcctgggaagtatgctcgcaagagtg 99.5 Bacteria|Cyanobacteria|Oxyphotobacteria|Chloroplast|* FM242284.1.1439,KU243249.1.1264,FJ002183.1.1431,KF771485.1.1443,JF272135.1.1444 1 0 0 0 0 791 6 166
Tax4Fun supports several QIIME output formats. To import and use my data, I need to convert this OTU table into a QIIME-style OTU table (.qotu). My output looks like this:
#OTU PM-18S6ext_TATGCG_p16S PM-18SA_TTTTTC_p16S PM-18SB_GTCGTG_p16S PM-18SD_GCTATC_p16S PM-18SLAZ_TTTGTA_p16S PM-18SMIS_AGCCTG_p16S PM-18SQN_CAACAG_p16S PM-finaMIS_CTCTCG_p16S PM-finaTfA_TGCTGA_p16S PM-final6ext_GGTAGC_p16S PM-finalA_AATATG_p16S PM-finalB_TGAGCA_p16S PM-finalD_AATCAC_p16S PM-finalLAZ_GCAAAT_p16S PM-finalQN_AAATTG_p16S PM-finalT041p_CTGTGC_p16S PM-finalT0MIS_ACATAT_p16S PM-finalTfC_GAGCTT_p16S PM-finalTfD_TCTCGG_p16S PM-finalTfE_AGCGAC_p16S PM-finalTfF_GCCAAG_p16S PM-finalTfG_AGGTTC_p16S taxonomy*
1 0 0 0 0 0 0 0 18 0 27 1 0 25 0 0 0 1925 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
2 0 0 0 0 0 0 0 409 0 726 193 0 220 68 59 76 0 0 121 3 7 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
3 0 0 0 0 0 0 0 0 0 0 0 0 15 1 0 1449 89 40 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__uncultured_marine_eukaryote; g__; s__
4 1 0 0 0 0 0 0 0 0 0 0 0 791 6 166 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
After that, I converted my QIIME OTU table to a biom file and added the metadata as follows:
biom convert -i "${FILTERED_QOTU}" -o "${FILTERED_BIOM}" --table-type="OTU table" --process-obs-metadata taxonomy --to-json
biom add-metadata -i "${FILTERED_BIOM}" -o "${FILTERED_BIOM_ANNOTATED}" --sample-metadata-fp "../tags_list.txt" --output-as-json
This gave me the following biom file:
{"id": "None","format": "Biological Observation Matrix 1.0.0","format_url": "http://biom-format.org","type": "OTU table","generated_by": "BIOM-Format 1.3.1","date": "2018-02-27T16:18:17.853139","matrix_type": "sparse","matrix_element_type": "float","shape": [26, 22],"data": [[0,7,18.0],[0,9,27.0],[0,10,1.0],[0,14,4.0],[0,15,3.0],[0,16,4.0],[0,17,25.0],[0,21,1925.0],[1,7,409.0],[1,9,726.0],[1,10,193.0],[1,12,1.0],[1,13,220.0],[1,14,68.0],[1,15,59.0],[1,16,76.0],[1,19,121.0],[1,20,3.0],[1,21,7.0],[2,16,15.0],[2,17,1.0],[2,19,1449.0],[2,20,89.0],[2,21,40.0],[3,0,1.0],[3,16,7.0],[3,17,5.0],[3,19,791.0],[3,20,6.0],[3,21,166.0],[4,7,8.0],[4,9,25.0],[4,15,13.0],[4,16,18.0],[4,17,1.0],[4,19,275.0],[4,20,44.0],[4,21,48.0],[5,7,40.0],[5,9,56.0],[5,10,12.0],[5,13,12.0],[5,14,4.0],[5,15,7.0],[5,16,13.0],[5,19,226.0],[5,20,5.0],[5,21,6.0],[6,7,24.0],[6,9,147.0],[6,10,29.0],[6,13,12.0],[6,14,8.0],[6,15,3.0],[6,19,2.0],[6,20,1.0],[6,21,8.0],[7,7,18.0],[7,9,35.0],[7,10,30.0],[7,13,11.0],[7,14,26.0],[7,15,5.0],[7,16,3.0],[7,19,10.0],[7,20,8.0],[7,21,7.0],[8,16,2.0],[8,17,1.0],[8,19,85.0],[8,20,1.0],[8,21,45.0],[9,17,2.0],[9,20,13.0],[9,21,79.0],[10,7,1.0],[10,9,11.0],[10,13,34.0],[10,14,1.0],[11,7,1.0],[11,13,21.0],[11,15,2.0],[11,16,15.0],[12,9,4.0],[12,10,1.0],[12,14,1.0],[12,15,16.0],[12,16,8.0],[13,19,5.0],[13,20,15.0],[13,21,6.0],[14,19,12.0],[14,20,8.0],[14,21,1.0],[15,8,1.0],[15,20,20.0],[16,15,3.0],[16,16,1.0],[16,20,14.0],[17,19,11.0],[17,20,4.0],[18,9,3.0],[18,13,3.0],[18,15,2.0],[18,16,7.0],[19,7,2.0],[19,9,5.0],[19,14,5.0],[20,19,10.0],[20,20,1.0],[21,19,4.0],[21,20,4.0],[21,21,1.0],[22,19,7.0],[22,21,1.0],[23,7,5.0],[23,15,1.0],[23,16,1.0],[24,21,7.0],[25,19,5.0]],"rows": [{"id": "1", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "2", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "3", "metadata": 
{"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__uncultured_marine_eukaryote", "g__", "s__"]}},{"id": "4", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "6", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "7", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "8", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "9", "metadata": {"taxonomy": ["k__N", "p__o", "c___", "o__h", "f__i", "g__t", "s__"]}},{"id": "10", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "14", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "17", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "18", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "19", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "21", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "24", "metadata": {"taxonomy": ["k__Bacteria", "p__Bacteroidetes", "c__Bacteroidia", "o__Flavobacteriales", "f__Flavobacteriaceae", "g__Pseudofulvibacter", "s__*"]}},{"id": "25", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Synechococcales", "f__Cyanobiaceae", "g__*", "s__*"]}},{"id": "26", "metadata": {"taxonomy": 
["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__uncultured_bacterium", "g__", "s__"]}},{"id": "28", "metadata": {"taxonomy": ["k__Bacteria", "p__Proteobacteria", "c__Gammaproteobacteria", "o__Betaproteobacteriales", "f__Burkholderiaceae", "g__Limnobacter", "s__*"]}},{"id": "29", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "32", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "33", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "37", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "40", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "41", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "42", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__*", "g__", "s__"]}},{"id": "51", "metadata": {"taxonomy": ["k__Bacteria", "p__Cyanobacteria", "c__Oxyphotobacteria", "o__Chloroplast", "f__uncultured_marine_eukaryote", "g__", "s__"]}}],"columns": [{"id": "PM-18S6ext_TATGCG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TATGCG"}},{"id": "PM-18SA_TTTTTC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TTTTTC"}},{"id": "PM-18SB_GTCGTG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GTCGTG"}},{"id": "PM-18SD_GCTATC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GCTATC"}},{"id": "PM-18SLAZ_TTTGTA_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TTTGTA"}},{"id": "PM-18SMIS_AGCCTG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AGCCTG"}},{"id": "PM-18SQN_CAACAG_p16S", 
"metadata": {"DOB": "", "BarcodeSequence": "CAACAG"}},{"id": "PM-finaMIS_CTCTCG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "CTCTCG"}},{"id": "PM-finaTfA_TGCTGA_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TGCTGA"}},{"id": "PM-final6ext_GGTAGC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GGTAGC"}},{"id": "PM-finalA_AATATG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AATATG"}},{"id": "PM-finalB_TGAGCA_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TGAGCA"}},{"id": "PM-finalD_AATCAC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AATCAC"}},{"id": "PM-finalLAZ_GCAAAT_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GCAAAT"}},{"id": "PM-finalQN_AAATTG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AAATTG"}},{"id": "PM-finalT041p_CTGTGC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "CTGTGC"}},{"id": "PM-finalT0MIS_ACATAT_p16S", "metadata": {"DOB": "", "BarcodeSequence": "ACATAT"}},{"id": "PM-finalTfC_GAGCTT_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GAGCTT"}},{"id": "PM-finalTfD_TCTCGG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "TCTCGG"}},{"id": "PM-finalTfE_AGCGAC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AGCGAC"}},{"id": "PM-finalTfF_GCCAAG_p16S", "metadata": {"DOB": "", "BarcodeSequence": "GCCAAG"}},{"id": "PM-finalTfG_AGGTTC_p16S", "metadata": {"DOB": "", "BarcodeSequence": "AGGTTC"}}]}
I then used the biom file with Tax4Fun, but I get the following error message:
biom16S = importQIIMEBiomData('all_p16S.OTU.filtered.annotated.biom')
Error in rowsum.default(as.matrix(taxProfile), ModSilvaIds) :
incorrect length for 'group'
I tried another way: I converted the OTU table to the QIIME-style text format and used that with Tax4Fun, but I get an error again:
biom convert --to-tsv -i "all_p16S.OTU.filtered.annotated.biom" -o "filtered.annotated.txt" --table-type="OTU table" --header-key taxonomy
folderReferenceData = "SILVA119"
biom16S = importQIIMEData("filtered.annotated.txt")
Tax4FunOutput = Tax4Fun(biom16S, folderReferenceData)
Error in `rownames<-`(`*tmp*`, value = c("PM-18S6ext_TATGCG_p16S", "PM-18SA_TTTTTC_p16S", :
length of 'dimnames' [1] not equal to array extent
The TXT file looks like this:
# Constructed from biom file
#OTU ID PM-18S6ext_TATGCG_p16S PM-18SA_TTTTTC_p16S PM-18SB_GTCGTG_p16S PM-18SD_GCTATC_p16S PM-18SLAZ_TTTGTA_p16S PM-18SMIS_AGCCTG_p16S PM-18SQN_CAACAG_p16S PM-finaMIS_CTCTCG_p16S PM-finaTfA_TGCTGA_p16S PM-final6ext_GGTAGC_p16S PM-finalA_AATATG_p16S PM-finalB_TGAGCA_p16S PM-finalD_AATCAC_p16S PM-finalLAZ_GCAAAT_p16S PM-finalQN_AAATTG_p16S PM-finalT041p_CTGTGC_p16S PM-finalT0MIS_ACATAT_p16S PM-finalTfC_GAGCTT_p16S PM-finalTfD_TCTCGG_p16S PM-finalTfE_AGCGAC_p16S PM-finalTfF_GCCAAG_p16S PM-finalTfG_AGGTTC_p16S taxonomy
1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 18.0 0.0 27.0 1.0 0.0 0.0 0.0 4.0 3.0 4.0 25.0 0.0 0.0 0.0 1925.0 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 409.0 0.0 726.0 193.0 0.0 1.0 220.0 68.0 59.0 76.0 0.0 0.0 121.0 3.0 7.0 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 15.0 1.0 0.0 1449.0 89.0 40.0 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__uncultured_marine_eukaryote; g__; s__
4 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 7.0 5.0 0.0 791.0 6.0 166.0 k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__
I searched on Google, and the problem seems to be how the taxonomy column is formatted. However, when I check it, it looks fine:
x = read_biom('all_p16S.OTU.filtered.annotated.biom')
head(biom_data(x))
6 x 22 sparse Matrix of class "dgCMatrix"
[[ suppressing 22 column names ‘PM-18S6ext_TATGCG_p16S’, ‘PM-18SA_TTTTTC_p16S’, ‘PM-18SB_GTCGTG_p16S’ ... ]]
1 . . . . . . . 18 . 27 1 . . . 4 3 4 25 . . .
2 . . . . . . . 409 . 726 193 . 1 220 68 59 76 . . 121 3
3 . . . . . . . . . . . . . . . . 15 1 . 1449 89
4 1 . . . . . . . . . . . . . . . 7 5 . 791 6
6 . . . . . . . 8 . 25 . . . . . 13 18 1 . 275 44
7 . . . . . . . 40 . 56 12 . . 12 4 7 13 . . 226 5
taxmat = as.matrix(observation_metadata(x), rownames.force=TRUE)
taxmat
taxonomy1 taxonomy2 taxonomy3
1 "k__Bacteria" "p__Cyanobacteria" "c__Oxyphotobacteria"
2 "k__Bacteria" "p__Cyanobacteria" "c__Oxyphotobacteria"
3 "k__Bacteria" "p__Cyanobacteria" "c__Oxyphotobacteria"
4 "k__Bacteria" "p__Cyanobacteria" "c__Oxyphotobacteria"
6 "k__Bacteria" "p__Cyanobacteria" "c__Oxyphotobacteria"
7 "k__Bacteria" "p__Cyanobacteria" "c__Oxyphotobacteria"
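The excerpt above shows only the first rows and ranks, so an odd taxonomy string further down the table could go unnoticed. For example, the biom file above contains an entry for OTU 9 whose ranks read "k__N", "p__o", "c___", and so on. A minimal sketch for listing every distinct taxonomy string in the TSV export with its row count follows. The function name `list_taxonomy_values` is hypothetical, and the sketch assumes the file is tab-separated, that taxonomy is the last column, and that header lines start with `#`.

```shell
# Hypothetical helper: count how often each taxonomy string occurs in a
# tab-separated OTU table (taxonomy assumed to be the last column).
list_taxonomy_values() {
  awk -F '\t' '
    /^#/ { next }                 # skip comment and header lines
    { count[$NF]++ }              # tally the last field of each data row
    END { for (t in count) print count[t] "\t" t }
  ' "$1" | sort -k1,1nr           # most frequent taxonomy strings first
}
```

Calling `list_taxonomy_values filtered.annotated.txt` prints one line per distinct taxonomy string, which makes unusual entries easier to spot than scrolling through the table.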
Any suggestions are welcome!
I had a similar problem when I started using Tax4Fun. From searching the web, the solution I found (and it worked for me) was to change how the taxonomies are written. That is, remove the "k__", "p__" and so on. An example taxonomy should read "Bacteria, Cyanobacteria, Oxyphotobacteria". I also read that it works better with the TSV format than with the BIOM format.
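As a rough illustration of that suggestion, here is a minimal sketch that strips the rank prefixes from the taxonomy column of a TSV OTU table. It is not the code mentioned in this answer. The function name `strip_rank_prefixes` is hypothetical, and the sketch assumes a tab-separated table with taxonomy in the last column and header lines starting with `#`. Dropping empty ranks and keeping the "; " separator from the input are choices made here, not something Tax4Fun is confirmed to require.

```shell
# Hypothetical helper: remove QIIME-style rank prefixes (k__, p__, ..., s__)
# from the last column of a tab-separated OTU table read on stdin.
strip_rank_prefixes() {
  awk '
    BEGIN { FS = OFS = "\t" }
    /^#/ { print; next }                    # keep header lines unchanged
    {
      n = split($NF, ranks, ";")
      out = ""
      for (i = 1; i <= n; i++) {
        r = ranks[i]
        gsub(/^[ \t]+|[ \t]+$/, "", r)      # trim surrounding whitespace
        sub(/^[a-z]__/, "", r)              # drop the rank prefix
        if (r != "") out = (out == "") ? r : (out "; " r)   # skip empty ranks
      }
      $NF = out
      print
    }
  '
}
```

For instance, `strip_rank_prefixes < filtered.annotated.txt > filtered.noprefix.txt` (the output file name is only an example) would turn "k__Bacteria; p__Cyanobacteria; c__Oxyphotobacteria; o__Chloroplast; f__*; g__; s__" into "Bacteria; Cyanobacteria; Oxyphotobacteria; Chloroplast; *". The new file could then be passed to `importQIIMEData`.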
If you need more help, I have the code I used to infer functions with Tax4Fun.
Best!