Creating a dataset using Pandas

Given a CSV file...

neg,,,,,,,
SAMPLE 1,,SAMPLE 2,,SAMPLE 3,,SAMPLE 4,
50.0261,2.17E+02,50.0224,3.31E+02,50.0007,5.38E+02,50.0199,2.39E+02
50.1057,2.65E+02,50.0435,3.92E+02,50.0657,5.52E+02,50.0465,3.37E+02
50.1514,2.90E+02,50.0781,3.88E+02,50.1115,5.75E+02,50.0584,2.58E+02
50.166,3.85E+02,50.1245,4.25E+02,50.1258,5.11E+02,50.0765,4.47E+02
50.1831,2.55E+02,50.1748,3.71E+02,50.1411,6.21E+02,50.1246,1.43E+02
50.2023,3.45E+02,50.2161,2.59E+02,50.1671,5.56E+02,50.1866,3.77E+02
50.223,4.02E+02,50.2381,4.33E+02,50.1968,6.31E+02,50.2276,3.41E+02
50.2631,1.89E+02,50.2826,4.63E+02,50.211,3.92E+02,50.2717,4.71E+02
50.2922,2.72E+02,50.3593,4.52E+02,50.2279,5.92E+02,50.376,3.09E+02
50.319,2.46E+02,50.4019,4.15E+02,50.2929,5.60E+02,50.3979,2.56E+02
50.3523,3.57E+02,50.423,3.31E+02,50.3659,4.84E+02,50.4237,3.28E+02
50.3968,4.67E+02,50.4402,1.76E+02,50.437,1.89E+02,50.4504,2.71E+02
50.4431,1.88E+02,50.479,4.85E+02,50.5137,6.63E+02,50.5078,2.54E+02
50.481,3.63E+02,50.5448,3.51E+02,50.5401,5.11E+02,50.5436,2.69E+02
50.506,3.73E+02,50.5872,4.03E+02,50.5593,6.56E+02,50.555,3.06E+02
50.5379,3.00E+02,50.6076,2.96E+02,50.6034,5.02E+02,50.6059,2.83E+02
50.5905,2.38E+02,50.6341,2.67E+02,50.6579,6.37E+02,50.6484,1.99E+02
50.6564,1.30E+02,50.662,3.53E+02,50.6888,7.37E+02,50.7945,4.84E+02
50.7428,2.38E+02,50.6952,4.21E+02,50.7132,6.71E+02,50.8044,4.41E+02
50.8052,3.67E+02,50.7397,1.99E+02,50.7421,6.29E+02,50.8213,1.69E+02
50.8459,2.80E+02,50.7685,3.73E+02,50.7872,5.30E+02,50.8401,3.88E+02
50.9021,3.56E+02,50.7757,4.54E+02,50.8251,4.13E+02,50.8472,3.61E+02
50.9425,3.89E+02,50.8027,7.20E+02,50.8418,5.73E+02,50.8893,1.18E+02
51.0117,2.29E+02,50.8206,2.93E+02,50.8775,4.34E+02,50.9285,2.64E+02
51.0244,5.19E+02,50.8364,4.80E+02,50.9101,4.25E+02,50.9591,1.64E+02
51.0319,3.62E+02,50.8619,2.90E+02,50.9222,5.11E+02,51.0034,2.70E+02
51.0439,4.24E+02,50.9098,3.22E+02,50.9675,4.33E+02,51.0577,2.88E+02
51.0961,3.59E+02,50.969,3.87E+02,51.0123,6.03E+02,51.0712,3.18E+02
51.1429,2.49E+02,51.0009,2.42E+02,51.0266,7.30E+02,51.1015,1.84E+02
51.1597,2.71E+02,51.0262,1.32E+02,51.0554,3.69E+02,51.1291,3.71E+02
51.177,2.84E+02,51.0778,1.58E+02,51.1113,4.50E+02,51.1378,3.54E+02
51.1924,2.00E+02,51.1313,4.07E+02,51.1464,3.86E+02,51.1871,1.55E+02
51.2055,2.25E+02,51.1844,2.08E+02,51.1826,7.06E+02,51.2511,2.05E+02
51.2302,3.81E+02,51.2197,5.49E+02,51.2284,7.00E+02,51.3036,2.60E+02
51.264,2.16E+02,51.2306,3.76E+02,51.271,3.83E+02,51.3432,1.99E+02
51.2919,2.29E+02,51.2468,2.87E+02,51.308,3.89E+02,51.3775,2.45E+02
51.3338,3.67E+02,51.2739,5.56E+02,51.3394,5.17E+02,51.3977,3.86E+02
51.3743,2.57E+02,51.3228,3.18E+02,51.3619,6.03E+02,51.4151,3.37E+02
51.3906,3.78E+02,51.3685,2.33E+02,51.3844,4.44E+02,51.4254,2.72E+02
51.4112,3.29E+02,51.3912,5.03E+02,51.4179,5.68E+02,51.4426,3.17E+02
51.4423,1.86E+02,51.4165,2.68E+02,51.4584,5.10E+02,51.4834,3.87E+02
51.537,3.48E+02,51.4645,3.76E+02,51.5179,5.75E+02,51.544,4.37E+02
51.637,4.51E+02,51.5078,2.76E+02,51.569,4.73E+02,51.5554,4.52E+02
51.665,2.27E+02,51.5388,2.51E+02,51.5894,4.57E+02,51.5958,1.96E+02
51.6925,5.60E+02,51.5486,2.79E+02,51.614,4.88E+02,51.6329,5.40E+02
51.7409,4.19E+02,51.5584,2.53E+02,51.6458,5.72E+02,51.6477,3.23E+02
51.7851,4.29E+02,51.5961,2.72E+02,51.7076,4.36E+02,51.6577,2.70E+02
51.8176,3.11E+02,51.6608,2.04E+02,51.776,5.59E+02,51.6699,3.89E+02
51.8764,3.94E+02,51.7093,5.14E+02,51.8157,6.66E+02,51.6788,2.83E+02
51.9135,3.26E+02,51.7396,1.88E+02,51.8514,4.26E+02,51.7201,3.91E+02
51.9592,2.66E+02,51.7931,2.72E+02,51.8791,5.61E+02,51.7546,3.41E+02
51.9954,2.97E+02,51.8428,5.96E+02,51.9129,5.14E+02,51.7646,2.27E+02
52.0751,2.24E+02,51.8923,3.94E+02,51.959,5.18E+02,51.7801,1.43E+02
52.1456,3.26E+02,51.9177,2.82E+02,52.0116,4.21E+02,51.8022,2.27E+02
52.1846,3.42E+02,51.9265,3.21E+02,52.0848,5.10E+02,51.83,2.66E+02
52.2284,2.66E+02,51.9413,3.56E+02,52.1412,6.20E+02,51.8698,1.74E+02
52.2666,5.32E+02,51.9616,2.19E+02,52.1722,5.72E+02,51.9084,2.89E+02
52.2936,4.24E+02,51.9845,1.53E+02,52.1821,5.18E+02,51.937,1.69E+02
52.3256,3.69E+02,52.0051,3.53E+02,52.2473,5.51E+02,51.9641,3.31E+02
52.3566,2.50E+02,52.0299,2.87E+02,52.3103,4.12E+02,52.0292,2.63E+02
52.4192,3.08E+02,52.0603,3.15E+02,52.35,8.76E+02,52.0633,3.94E+02
52.4757,2.99E+02,52.0988,3.45E+02,52.3807,6.95E+02,52.0797,2.88E+02
52.498,2.37E+02,52.1176,3.63E+02,52.4234,4.89E+02,52.1073,2.97E+02
52.57,2.58E+02,52.1698,3.11E+02,52.4451,4.54E+02,52.1546,3.41E+02
52.6178,4.29E+02,52.2352,3.96E+02,52.4627,5.38E+02,52.2219,3.68E+02

How can I split the samples into overlapping 0.25 m/z bins, where the first column of each pair (SAMPLE n,,) holds the m/z value and the second holds the weight?

To load the file into a Pandas DataFrame, I currently do:

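import csv

import pandas as pd


def load_raw_data():
    raw_data = []
    # On Python 3 the csv module expects a text-mode file opened with newline="".
    with open("negsmaller.csv", newline="") as rawfile:
        reader = csv.reader(rawfile, delimiter=",")
        next(reader)  # skip the leading "neg,,,,,,," line
        for row in reader:
            raw_data.append(row)

    raw_data = pd.DataFrame(raw_data)

    # Transpose so that each sample occupies a pair of rows (m/z, then weight).
    return raw_data.T


if __name__ == '__main__':
    raw_data = load_raw_data()

    print(raw_data)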

Which prints:

         0         1         2         3         4         5         6   \
0  SAMPLE 1   50.0261   50.1057   50.1514    50.166   50.1831   50.2023   
1            2.17E+02  2.65E+02  2.90E+02  3.85E+02  2.55E+02  3.45E+02   
2  SAMPLE 2   50.0224   50.0435   50.0781   50.1245   50.1748   50.2161   
3            3.31E+02  3.92E+02  3.88E+02  4.25E+02  3.71E+02  2.59E+02   
4  SAMPLE 3   50.0007   50.0657   50.1115   50.1258   50.1411   50.1671   
5            5.38E+02  5.52E+02  5.75E+02  5.11E+02  6.21E+02  5.56E+02   
6  SAMPLE 4   50.0199   50.0465   50.0584   50.0765   50.1246   50.1866   
7            2.39E+02  3.37E+02  2.58E+02  4.47E+02  1.43E+02  3.77E+02   

         7         8         9     ...           56        57        58  \
0    50.223   50.2631   50.2922    ...      52.2284   52.2666   52.2936   
1  4.02E+02  1.89E+02  2.72E+02    ...     2.66E+02  5.32E+02  4.24E+02   
2   50.2381   50.2826   50.3593    ...      51.9413   51.9616   51.9845   
3  4.33E+02  4.63E+02  4.52E+02    ...     3.56E+02  2.19E+02  1.53E+02   
4   50.1968    50.211   50.2279    ...      52.1412   52.1722   52.1821   
5  6.31E+02  3.92E+02  5.92E+02    ...     6.20E+02  5.72E+02  5.18E+02   
6   50.2276   50.2717    50.376    ...      51.8698   51.9084    51.937   
7  3.41E+02  4.71E+02  3.09E+02    ...     1.74E+02  2.89E+02  1.69E+02   

         59        60        61        62        63        64        65  
0   52.3256   52.3566   52.4192   52.4757    52.498     52.57   52.6178  
1  3.69E+02  2.50E+02  3.08E+02  2.99E+02  2.37E+02  2.58E+02  4.29E+02  
2   52.0051   52.0299   52.0603   52.0988   52.1176   52.1698   52.2352  
3  3.53E+02  2.87E+02  3.15E+02  3.45E+02  3.63E+02  3.11E+02  3.96E+02  
4   52.2473   52.3103     52.35   52.3807   52.4234   52.4451   52.4627  
5  5.51E+02  4.12E+02  8.76E+02  6.95E+02  4.89E+02  4.54E+02  5.38E+02  
6   51.9641   52.0292   52.0633   52.0797   52.1073   52.1546   52.2219  
7  3.31E+02  2.63E+02  3.94E+02  2.88E+02  2.97E+02  3.41E+02  3.68E+02  

[8 rows x 66 columns]

Process finished with exit code 0
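
For binning, a long table with one (sample, m/z, weight) row per measurement may be easier to work with than the transposed frame above. The following is only a sketch of one way to get there with pd.read_csv. It assumes the layout shown in the question (one title line, then a row of sample names in every other column, then numeric pairs). The function name load_long and the column names sample, mz and weight are made up for illustration.

```python
import io

import pandas as pd


def load_long(source):
    """Return one row per measurement with columns sample, mz, weight.

    `source` is a path or file-like object laid out as in the question.
    load_long and the output column names are illustrative choices.
    """
    # skiprows=1 drops the "neg,,,,,,," line; header=None keeps the
    # "SAMPLE n" row as ordinary data so the names can be read from it.
    raw = pd.read_csv(source, skiprows=1, header=None)
    names = raw.iloc[0, 0::2]            # sample names sit in every other column
    values = raw.iloc[1:].astype(float)  # remaining rows are numeric

    frames = []
    for pos, name in zip(range(0, raw.shape[1], 2), names):
        # Columns pos and pos + 1 form one (m/z, weight) pair.
        pair = values.iloc[:, [pos, pos + 1]].dropna()
        pair.columns = ["mz", "weight"]
        pair.insert(0, "sample", name)
        frames.append(pair)
    return pd.concat(frames, ignore_index=True)


# Small demonstration using the first two data rows from the question.
demo_text = (
    "neg,,,,,,,\n"
    "SAMPLE 1,,SAMPLE 2,,SAMPLE 3,,SAMPLE 4,\n"
    "50.0261,2.17E+02,50.0224,3.31E+02,50.0007,5.38E+02,50.0199,2.39E+02\n"
    "50.1057,2.65E+02,50.0435,3.92E+02,50.0657,5.52E+02,50.0465,3.37E+02\n"
)
demo = load_long(io.StringIO(demo_text))
print(demo)
```

With the real file, the call would presumably be load_long("negsmaller.csv").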

My desired output: take overlapping bins 0.25 wide, average the values in the adjacent (weight) column within each bin, and report that average as a single value for the bin. So,

0.01    3
0.10    4
0.24    2

would become 0.25    3
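
One possible reading of this is sketched below. It makes several assumptions not stated in the question: each window covers [left, left + 0.25), windows are labelled by their upper edge (as in the 0.25 example), and consecutive windows start 0.125 apart, so that neighbours overlap by half. The function name overlapping_bin_means and the step parameter are hypothetical; setting step equal to width gives ordinary non-overlapping bins.

```python
import numpy as np
import pandas as pd


def overlapping_bin_means(mz, weight, width=0.25, step=0.125):
    """Mean weight inside sliding m/z windows [left, left + width).

    The step value (distance between window starts) is an assumption;
    the question does not say how much the bins should overlap.
    """
    mz = np.asarray(mz, dtype=float)
    weight = np.asarray(weight, dtype=float)

    # Align the first window to a multiple of `width` at or below the minimum.
    start = np.floor(mz.min() / width) * width
    n_windows = int(np.floor((mz.max() - start) / step)) + 1
    lefts = start + step * np.arange(n_windows)

    rows = []
    for left in lefts:
        mask = (mz >= left) & (mz < left + width)
        if mask.any():  # skip windows that contain no points
            rows.append((left + width, weight[mask].mean()))
    return pd.DataFrame(rows, columns=["bin_upper", "mean_weight"])


# The three-point example from the question.
result = overlapping_bin_means([0.01, 0.10, 0.24], [3, 4, 2])
print(result)
```

For the three points above, the first window [0, 0.25) contains all of them, so its row is labelled 0.25 with a mean of 3. The second window [0.125, 0.375) contains only the point at 0.24. To process the file, the same function could be applied to each sample's m/z and weight columns separately.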
