Shell script to validate a CSV file column by column

Question

I was wondering how I could write this in shell. I want to validate the fields in a CSV file column by column. For example, I only want to validate that column one is a number:

Number,Letter

1,u
2,h
3,d
4,j

For a file like the one above, the logic I have in mind is:

loop over all files (loop1)
    loop over rows 2..n (loop2)    # skip the first row since it's a header
        validate column 1
        validate column 2
        ...
    end loop2
    if (file passes validation)
        copy to goodFile directory
    else
        send to badFile directory
end loop1

Below is my row-by-row validation. What modifications would I need to make it work like the pseudocode above? I'm terrible at Unix and have only just started learning awk.

#!/bin/sh

for file in /source/*.csv

 do
   awk -F"," '{                       # awk -F", " {'print$2'} to get the     fields.
$date_regex = '~(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d~';

if (length($1) == "")  
    break
if (length($2) == "") && (length($2) > 30)
    break
if (length($3) == "") && ($3 !~ /$date_regex/)
    break
if (length($4) == "") && (($4 != "S") || ($4 != "E")   
    break
if (length($5) == "") && ((length($5) < 9 || (length($5) > 11)))
    break



}' file

   #whatever you need with "$file"

done

shell csv scripting

user4501028, Feb 6 '15 at 20:01

2 Answers

Solution

Assuming there is no stray whitespace in the file, here is how I would do it in bash.

# validate: first field is an integer
# validate: 2nd field is a lower-case letter

for file in *.csv; do
    good=true
    {
        read -r header    # skip the header row
        while IFS=, read -ra fields; do
            if [[ ! ( 
                      ${fields[0]} =~ ^[+-]?[[:digit:]]+$
                      && ${fields[1]} == [a-z]
                    ) ]]
            then
                good=false
                break
            fi
        done
    } < "$file"
    if $good; then
        : # handle good file
    else
        : # handle bad file
    fi
done
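
One way to fill in the two placeholder branches is to wrap the check in a function and move each file according to its exit status. The following is only a sketch in portable sh: the function names `is_valid_csv` and `sort_csv_files` and the three directory arguments are made up for illustration, and it assumes every file starts with a header row, uses Unix line endings, and ends with a newline.

```shell
#!/bin/sh

# Hypothetical helper: succeeds only if every data row of "$1" has an
# integer in column 1 and a single lower-case letter in column 2.
is_valid_csv() {
    {
        read -r _header                # discard the header row (assumed present)
        while IFS=, read -r num letter _rest; do
            num=${num#[+-]}            # allow one optional leading sign
            case $num in
                ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
            esac
            case $letter in
                [a-z]) ;;                  # exactly one lower-case letter
                *) return 1 ;;
            esac
        done
    } < "$1"
    return 0
}

# Hypothetical driver: move each *.csv in "$1" to "$2" (good) or "$3" (bad).
sort_csv_files() {
    mkdir -p "$2" "$3"
    for f in "$1"/*.csv; do
        [ -e "$f" ] || continue        # the glob matched nothing
        if is_valid_csv "$f"; then
            mv -- "$f" "$2"/
        else
            mv -- "$f" "$3"/
        fi
    done
}
```

With the layout from the question, a call might look like `sort_csv_files /source /source/good /source/bad`. Extra columns after the second are ignored here, as in the loop above.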

user7552, Feb 6 '15 at 20:52

user3220113, Feb 6 '15 at 20:56 · Accepted Answer

I'll combine two different ways of writing the loop. Lines starting with # are comments:

# Read all files. This assumes /source/good and /source/bad already exist.
for file in /source/*.csv ; do
   # init two variables before processing a new file
   FILESTATUS=GOOD
   FIRSTROW=true
   # process the file one line at a time, splitting each line on the
   # Internal Field Separator ","
   # (redirecting into the loop, instead of piping from cat, keeps the loop
   # in the current shell, so FILESTATUS is still set after "done")
   while IFS=, read -r field1 field2; do
      # Skip first line, the header row
      if [ "${FIRSTROW}" = "true" ]; then
         FIRSTROW=false
         # skip processing of this line, continue with next record
         continue
      fi

      # Lots of different checks are possible here
      # (they are easy to look up, e.g. "check field is an integer")
      if [[ "${field1}" = somestringprefix* ]]; then
         FILESTATUS=BAD
         # Stop inner loop
         break
      fi
      # somecheckonField2   # placeholder for a check on the second field
   done < "${file}"
   if [ "${FILESTATUS}" = "GOOD" ] ; then
      mv "${file}" /source/good
   else
      mv "${file}" /source/bad
   fi
done
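
Since the question started from awk, the per-file pass/fail decision can also be left to awk and read back through its exit status. The sketch below is one interpretation of the five checks the question seems to be aiming for (column 1 non-empty, column 2 non-empty and at most 30 characters, column 3 a date such as MM/DD/YYYY, column 4 either S or E, column 5 between 9 and 11 characters). The function name `validate_rows` is made up, and the column meanings are an assumption rather than something the question states.

```shell
#!/bin/sh

# Hypothetical helper: exit status 0 if every data row of "$1" passes, 1 otherwise.
validate_rows() {
    awk -F',' -v date_re='^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)[0-9][0-9]$' '
        NR == 1                           { next }            # skip the header row
        $1 == ""                          { bad = 1; exit }   # column 1 must not be empty
        $2 == "" || length($2) > 30       { bad = 1; exit }   # column 2: 1 to 30 characters
        $3 !~ date_re                     { bad = 1; exit }   # column 3: date (assumed format)
        $4 != "S" && $4 != "E"            { bad = 1; exit }   # column 4: S or E
        length($5) < 9 || length($5) > 11 { bad = 1; exit }   # column 5: 9 to 11 characters
        END { exit bad + 0 }                                  # 0 = all rows passed
    ' "$1"
}
```

Inside the `for` loop, a caller could then write `if validate_rows "$file"; then mv "$file" /source/good; else mv "$file" /source/bad; fi`. The `exit` in each rule jumps straight to `END`, so awk stops reading at the first bad row.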