Creating a dataset using Pandas
Given a CSV file...
neg,,,,,,,
SAMPLE 1,,SAMPLE 2,,SAMPLE 3,,SAMPLE 4,
50.0261,2.17E+02,50.0224,3.31E+02,50.0007,5.38E+02,50.0199,2.39E+02
50.1057,2.65E+02,50.0435,3.92E+02,50.0657,5.52E+02,50.0465,3.37E+02
50.1514,2.90E+02,50.0781,3.88E+02,50.1115,5.75E+02,50.0584,2.58E+02
50.166,3.85E+02,50.1245,4.25E+02,50.1258,5.11E+02,50.0765,4.47E+02
50.1831,2.55E+02,50.1748,3.71E+02,50.1411,6.21E+02,50.1246,1.43E+02
50.2023,3.45E+02,50.2161,2.59E+02,50.1671,5.56E+02,50.1866,3.77E+02
50.223,4.02E+02,50.2381,4.33E+02,50.1968,6.31E+02,50.2276,3.41E+02
50.2631,1.89E+02,50.2826,4.63E+02,50.211,3.92E+02,50.2717,4.71E+02
50.2922,2.72E+02,50.3593,4.52E+02,50.2279,5.92E+02,50.376,3.09E+02
50.319,2.46E+02,50.4019,4.15E+02,50.2929,5.60E+02,50.3979,2.56E+02
50.3523,3.57E+02,50.423,3.31E+02,50.3659,4.84E+02,50.4237,3.28E+02
50.3968,4.67E+02,50.4402,1.76E+02,50.437,1.89E+02,50.4504,2.71E+02
50.4431,1.88E+02,50.479,4.85E+02,50.5137,6.63E+02,50.5078,2.54E+02
50.481,3.63E+02,50.5448,3.51E+02,50.5401,5.11E+02,50.5436,2.69E+02
50.506,3.73E+02,50.5872,4.03E+02,50.5593,6.56E+02,50.555,3.06E+02
50.5379,3.00E+02,50.6076,2.96E+02,50.6034,5.02E+02,50.6059,2.83E+02
50.5905,2.38E+02,50.6341,2.67E+02,50.6579,6.37E+02,50.6484,1.99E+02
50.6564,1.30E+02,50.662,3.53E+02,50.6888,7.37E+02,50.7945,4.84E+02
50.7428,2.38E+02,50.6952,4.21E+02,50.7132,6.71E+02,50.8044,4.41E+02
50.8052,3.67E+02,50.7397,1.99E+02,50.7421,6.29E+02,50.8213,1.69E+02
50.8459,2.80E+02,50.7685,3.73E+02,50.7872,5.30E+02,50.8401,3.88E+02
50.9021,3.56E+02,50.7757,4.54E+02,50.8251,4.13E+02,50.8472,3.61E+02
50.9425,3.89E+02,50.8027,7.20E+02,50.8418,5.73E+02,50.8893,1.18E+02
51.0117,2.29E+02,50.8206,2.93E+02,50.8775,4.34E+02,50.9285,2.64E+02
51.0244,5.19E+02,50.8364,4.80E+02,50.9101,4.25E+02,50.9591,1.64E+02
51.0319,3.62E+02,50.8619,2.90E+02,50.9222,5.11E+02,51.0034,2.70E+02
51.0439,4.24E+02,50.9098,3.22E+02,50.9675,4.33E+02,51.0577,2.88E+02
51.0961,3.59E+02,50.969,3.87E+02,51.0123,6.03E+02,51.0712,3.18E+02
51.1429,2.49E+02,51.0009,2.42E+02,51.0266,7.30E+02,51.1015,1.84E+02
51.1597,2.71E+02,51.0262,1.32E+02,51.0554,3.69E+02,51.1291,3.71E+02
51.177,2.84E+02,51.0778,1.58E+02,51.1113,4.50E+02,51.1378,3.54E+02
51.1924,2.00E+02,51.1313,4.07E+02,51.1464,3.86E+02,51.1871,1.55E+02
51.2055,2.25E+02,51.1844,2.08E+02,51.1826,7.06E+02,51.2511,2.05E+02
51.2302,3.81E+02,51.2197,5.49E+02,51.2284,7.00E+02,51.3036,2.60E+02
51.264,2.16E+02,51.2306,3.76E+02,51.271,3.83E+02,51.3432,1.99E+02
51.2919,2.29E+02,51.2468,2.87E+02,51.308,3.89E+02,51.3775,2.45E+02
51.3338,3.67E+02,51.2739,5.56E+02,51.3394,5.17E+02,51.3977,3.86E+02
51.3743,2.57E+02,51.3228,3.18E+02,51.3619,6.03E+02,51.4151,3.37E+02
51.3906,3.78E+02,51.3685,2.33E+02,51.3844,4.44E+02,51.4254,2.72E+02
51.4112,3.29E+02,51.3912,5.03E+02,51.4179,5.68E+02,51.4426,3.17E+02
51.4423,1.86E+02,51.4165,2.68E+02,51.4584,5.10E+02,51.4834,3.87E+02
51.537,3.48E+02,51.4645,3.76E+02,51.5179,5.75E+02,51.544,4.37E+02
51.637,4.51E+02,51.5078,2.76E+02,51.569,4.73E+02,51.5554,4.52E+02
51.665,2.27E+02,51.5388,2.51E+02,51.5894,4.57E+02,51.5958,1.96E+02
51.6925,5.60E+02,51.5486,2.79E+02,51.614,4.88E+02,51.6329,5.40E+02
51.7409,4.19E+02,51.5584,2.53E+02,51.6458,5.72E+02,51.6477,3.23E+02
51.7851,4.29E+02,51.5961,2.72E+02,51.7076,4.36E+02,51.6577,2.70E+02
51.8176,3.11E+02,51.6608,2.04E+02,51.776,5.59E+02,51.6699,3.89E+02
51.8764,3.94E+02,51.7093,5.14E+02,51.8157,6.66E+02,51.6788,2.83E+02
51.9135,3.26E+02,51.7396,1.88E+02,51.8514,4.26E+02,51.7201,3.91E+02
51.9592,2.66E+02,51.7931,2.72E+02,51.8791,5.61E+02,51.7546,3.41E+02
51.9954,2.97E+02,51.8428,5.96E+02,51.9129,5.14E+02,51.7646,2.27E+02
52.0751,2.24E+02,51.8923,3.94E+02,51.959,5.18E+02,51.7801,1.43E+02
52.1456,3.26E+02,51.9177,2.82E+02,52.0116,4.21E+02,51.8022,2.27E+02
52.1846,3.42E+02,51.9265,3.21E+02,52.0848,5.10E+02,51.83,2.66E+02
52.2284,2.66E+02,51.9413,3.56E+02,52.1412,6.20E+02,51.8698,1.74E+02
52.2666,5.32E+02,51.9616,2.19E+02,52.1722,5.72E+02,51.9084,2.89E+02
52.2936,4.24E+02,51.9845,1.53E+02,52.1821,5.18E+02,51.937,1.69E+02
52.3256,3.69E+02,52.0051,3.53E+02,52.2473,5.51E+02,51.9641,3.31E+02
52.3566,2.50E+02,52.0299,2.87E+02,52.3103,4.12E+02,52.0292,2.63E+02
52.4192,3.08E+02,52.0603,3.15E+02,52.35,8.76E+02,52.0633,3.94E+02
52.4757,2.99E+02,52.0988,3.45E+02,52.3807,6.95E+02,52.0797,2.88E+02
52.498,2.37E+02,52.1176,3.63E+02,52.4234,4.89E+02,52.1073,2.97E+02
52.57,2.58E+02,52.1698,3.11E+02,52.4451,4.54E+02,52.1546,3.41E+02
52.6178,4.29E+02,52.2352,3.96E+02,52.4627,5.38E+02,52.2219,3.68E+02
How can the samples be split into overlapping 0.25 m/z bins, where the first column of each pair (SAMPLE n,,) holds an m/z value and the second holds the weight?
To load the file into a Pandas DataFrame, I currently do:
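import csv
import pandas as pd


def load_raw_data():
    raw_data = []
    # newline="" is what the csv module expects for files opened in text mode
    with open("negsmaller.csv", newline="") as rawfile:
        reader = csv.reader(rawfile, delimiter=",")
        next(reader)  # skip the leading "neg" line
        for row in reader:
            raw_data.append(row)
    raw_data = pd.DataFrame(raw_data)
    # Transpose so that each sample's m/z values and weights become rows
    return raw_data.T


if __name__ == '__main__':
    raw_data = load_raw_data()
    print(raw_data)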
Which returns:
0 1 2 3 4 5 6 \
0 SAMPLE 1 50.0261 50.1057 50.1514 50.166 50.1831 50.2023
1 2.17E+02 2.65E+02 2.90E+02 3.85E+02 2.55E+02 3.45E+02
2 SAMPLE 2 50.0224 50.0435 50.0781 50.1245 50.1748 50.2161
3 3.31E+02 3.92E+02 3.88E+02 4.25E+02 3.71E+02 2.59E+02
4 SAMPLE 3 50.0007 50.0657 50.1115 50.1258 50.1411 50.1671
5 5.38E+02 5.52E+02 5.75E+02 5.11E+02 6.21E+02 5.56E+02
6 SAMPLE 4 50.0199 50.0465 50.0584 50.0765 50.1246 50.1866
7 2.39E+02 3.37E+02 2.58E+02 4.47E+02 1.43E+02 3.77E+02
7 8 9 ... 56 57 58 \
0 50.223 50.2631 50.2922 ... 52.2284 52.2666 52.2936
1 4.02E+02 1.89E+02 2.72E+02 ... 2.66E+02 5.32E+02 4.24E+02
2 50.2381 50.2826 50.3593 ... 51.9413 51.9616 51.9845
3 4.33E+02 4.63E+02 4.52E+02 ... 3.56E+02 2.19E+02 1.53E+02
4 50.1968 50.211 50.2279 ... 52.1412 52.1722 52.1821
5 6.31E+02 3.92E+02 5.92E+02 ... 6.20E+02 5.72E+02 5.18E+02
6 50.2276 50.2717 50.376 ... 51.8698 51.9084 51.937
7 3.41E+02 4.71E+02 3.09E+02 ... 1.74E+02 2.89E+02 1.69E+02
59 60 61 62 63 64 65
0 52.3256 52.3566 52.4192 52.4757 52.498 52.57 52.6178
1 3.69E+02 2.50E+02 3.08E+02 2.99E+02 2.37E+02 2.58E+02 4.29E+02
2 52.0051 52.0299 52.0603 52.0988 52.1176 52.1698 52.2352
3 3.53E+02 2.87E+02 3.15E+02 3.45E+02 3.63E+02 3.11E+02 3.96E+02
4 52.2473 52.3103 52.35 52.3807 52.4234 52.4451 52.4627
5 5.51E+02 4.12E+02 8.76E+02 6.95E+02 4.89E+02 4.54E+02 5.38E+02
6 51.9641 52.0292 52.0633 52.0797 52.1073 52.1546 52.2219
7 3.31E+02 2.63E+02 3.94E+02 2.88E+02 2.97E+02 3.41E+02 3.68E+02
[8 rows x 66 columns]
Process finished with exit code 0
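For binning, a long layout with one row per measurement may be easier to work with than the transposed frame. The following is only a sketch under assumptions: the function name `load_long` and the column names `sample`, `mz` and `weight` are made up for illustration, and the first rows of the file are inlined so the snippet runs on its own.

```python
import io

import pandas as pd

# First rows of the CSV from the question, inlined so the sketch is self-contained.
csv_text = """neg,,,,,,,
SAMPLE 1,,SAMPLE 2,,SAMPLE 3,,SAMPLE 4,
50.0261,2.17E+02,50.0224,3.31E+02,50.0007,5.38E+02,50.0199,2.39E+02
50.1057,2.65E+02,50.0435,3.92E+02,50.0657,5.52E+02,50.0465,3.37E+02
"""


def load_long(source):
    """Return one row per measurement: sample, mz, weight (hypothetical names)."""
    # Skip the leading "neg" line; keep the sample-name line as an ordinary row.
    raw = pd.read_csv(source, skiprows=1, header=None)
    names = raw.iloc[0, ::2]  # sample names sit in columns 0, 2, 4, ...
    values = raw.iloc[1:].apply(pd.to_numeric)
    frames = []
    for col, name in names.items():
        frames.append(pd.DataFrame({
            "sample": name,
            "mz": values[col].to_numpy(),          # m/z column of this sample
            "weight": values[col + 1].to_numpy(),  # weight column right next to it
        }))
    return pd.concat(frames, ignore_index=True)


long_data = load_long(io.StringIO(csv_text))
print(long_data)
```

With the real file, `load_long("negsmaller.csv")` should work the same way, provided every sample keeps the m/z and weight columns side by side as shown.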
My desired result: take overlapping 0.25 bins, then take the mean of the column next to the m/z values within each bin and report it as a single value. So,
0.01 3
0.10 4
0.24 2
would become 0.25 3
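To make the intent concrete, here is a minimal sketch of that averaging. It rests on assumptions: each bin is the window (lo, lo + 0.25] labelled by its upper edge, and "overlapping" is modelled with a hypothetical `step` parameter, where a step smaller than the width makes neighbouring windows overlap. The function name `bin_means` and the output column names are invented for illustration.

```python
import numpy as np
import pandas as pd


def bin_means(mz, weight, width=0.25, step=0.25, start=0.0):
    """Mean weight per m/z window (lo, lo + width], labelled by its upper edge.

    step == width gives adjacent bins; step < width gives overlapping ones.
    Values equal to `start` fall outside the first window.
    """
    mz = np.asarray(mz, dtype=float)
    weight = np.asarray(weight, dtype=float)
    # Enough windows that the largest m/z value is covered (needs step <= width).
    n_windows = int(np.ceil((mz.max() - start) / step))
    rows = []
    for i in range(n_windows):
        lo = start + i * step
        hi = lo + width
        mask = (mz > lo) & (mz <= hi)
        if mask.any():  # skip windows that contain no measurements
            rows.append((round(hi, 6), weight[mask].mean()))
    return pd.DataFrame(rows, columns=["mz_bin", "mean_weight"])


# The three example rows: 0.01 -> 3, 0.10 -> 4, 0.24 -> 2
example = bin_means([0.01, 0.10, 0.24], [3, 4, 2])
print(example)
```

For the example rows this yields a single bin at 0.25 with mean weight 3. Calling it with `step=0.125` would make each window share half its range with the next one, which is one possible reading of "overlapping".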