Query a pandas DataFrame to filter rows where a column is not NaN
I am new to Python and am using pandas.
I want to query a DataFrame and filter the rows where one of the columns is not NaN.
I tried:
a=dictionarydf.label.isnull()
but that only gives a Series populated with True or False.
I also tried this:
dictionarydf.query(dictionarydf.label.isnull())
but it raised an error, as I expected.
Sample data:
reference_word all_matching_words label review
0 account fees - account NaN N
1 account mobile - account NaN N
2 account monthly - account NaN N
3 administration delivery - administration NaN N
4 administration fund - administration NaN N
5 advisor fees - advisor NaN N
6 advisor optimum - advisor NaN N
7 advisor sub - advisor NaN N
8 aichi delivery - aichi NaN N
9 aichi pref - aichi NaN N
10 airport biz - airport travel N
11 airport cfo - airport travel N
12 airport cfomtg - airport travel N
13 airport meeting - airport travel N
14 airport summit - airport travel N
15 airport taxi - airport travel N
16 airport train - airport travel N
17 airport transfer - airport travel N
18 airport trip - airport travel N
19 ais admin - ais NaN N
20 ais alpine - ais NaN N
21 ais fund - ais NaN N
22 allegiance custody - allegiance NaN N
23 allegiance fees - allegiance NaN N
24 alpha late - alpha NaN N
25 alpha meal - alpha NaN N
26 alpha taxi - alpha NaN N
27 alpine admin - alpine NaN N
28 alpine ais - alpine NaN N
29 alpine fund - alpine NaN N
I want to filter the data down to the rows where label is not NaN.
Expected output:
reference_word all_matching_words label review
0 airport biz - airport travel N
1 airport cfo - airport travel N
2 airport cfomtg - airport travel N
3 airport meeting - airport travel N
4 airport summit - airport travel N
5 airport taxi - airport travel N
6 airport train - airport travel N
7 airport transfer - airport travel N
8 airport trip - airport travel N
1 Answer
Solution
You can use dropna:
df = df.dropna(subset=['label'])
print (df)
reference_word all_matching_words label review
10 airport biz - airport travel N
11 airport cfo - airport travel N
12 airport cfomtg - airport travel N
13 airport meeting - airport travel N
14 airport summit - airport travel N
15 airport taxi - airport travel N
16 airport train - airport travel N
17 airport transfer - airport travel N
18 airport trip - airport travel N
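Note that dropna keeps the original index labels (10 to 18 above), whereas the expected output in the question is numbered from 0. A minimal, self-contained sketch of how the index could be renumbered with reset_index(drop=True) follows. The small DataFrame here is a hypothetical stand-in for a few rows of the question's data, and the name filtered is only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few rows of the question's data
df = pd.DataFrame({
    "reference_word": ["account", "advisor", "airport", "airport", "ais"],
    "all_matching_words": [
        "fees - account",
        "fees - advisor",
        "biz - airport",
        "taxi - airport",
        "admin - ais",
    ],
    "label": [np.nan, np.nan, "travel", "travel", np.nan],
    "review": ["N"] * 5,
})

# Drop rows whose 'label' is NaN, then renumber the index from 0
filtered = df.dropna(subset=["label"]).reset_index(drop=True)
print(filtered)
```

Without reset_index, the two surviving rows would keep their original labels 2 and 3.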
Another solution is boolean indexing with notnull:
df = df[df.label.notnull()]
print (df)
reference_word all_matching_words label review
10 airport biz - airport travel N
11 airport cfo - airport travel N
12 airport cfomtg - airport travel N
13 airport meeting - airport travel N
14 airport summit - airport travel N
15 airport taxi - airport travel N
16 airport train - airport travel N
17 airport transfer - airport travel N
18 airport trip - airport travel N
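For completeness, here is a rough sketch that ties this back to the attempts in the question. isnull() and notnull() return a boolean Series (a mask), which is why the first attempt printed only True/False values. The mask has to be used to index the DataFrame. DataFrame.query expects an expression string rather than a Series, so passing the mask to it fails. The sample frame and the names mask, by_mask and by_query are hypothetical, and engine="python" is an assumption made here because method calls inside query may not be supported by the default engine in every pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few rows of the question's data
df = pd.DataFrame({
    "reference_word": ["account", "airport", "airport", "alpha"],
    "label": [np.nan, "travel", "travel", np.nan],
    "review": ["N"] * 4,
})

# notna() is an alias of notnull(); both return a boolean mask
mask = df["label"].notna()

# Boolean indexing: keep only the rows where the mask is True
by_mask = df[mask]

# query() takes an expression string, not a boolean Series
by_query = df.query("label.notna()", engine="python")

print(by_mask)
print(by_query)
```

Both results should contain the same rows as the dropna version, so which one to use is mostly a matter of taste.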