How to do a row based process using Deedle (Frame in and Frame out)

I am trying to use Deedle to do row-based processing on a data frame, but I just can't get my mind tuned into the Deedle way.

Say for a Frame like

    Indicator1 Indicator2
1   100        200
2   300        500
3   -200       1000

Say there are some rules that need to be applied to each indicator:

  1. If the indicator value is less than 500 and greater than 0, multiply it by 1.1
  2. If the indicator value is less than 0, make it NaN

I tried to use the Frame.mapRow.... functions.

I know I can use

fun v -> let indVal = v.GetAs<int>("Indicator1")
         let newIndVal = match indVal with 
                         |...... logic  
                         |...... some other logic
         let indVal2 = v.GetAs<int>("Indicator2")
         let newIndVal2 = match indVal2 with 
                         |...... logic  
                         |...... some other logic  

with Frame.mapRow....

But I am stuck on how to put newIndVal and newIndVal2 back into a row, and eventually back into a new data frame.

What I am trying to achieve is frame in, frame out. Also, I only know how to process the columns one by one (after retrieving them by index or name). If the logic to be applied is generic, is there a way to avoid applying it column by column?

An imperative (and really simple) way to do this with a C or C# 2D array is

loop through the row dimension
    loop through the column dimension
         apply the rule as the side effect to the array[row,col]

How to achieve this in Deedle?

UPDATE:

Leaf Garland's suggestion works great if the calculation doesn't need to reference other columns from the same row. In my case, I need to look at the data row by row, hence I would like to use Frame.mapRows. I should have been clearer about the simplified requirements:

Say for a Frame like

    Indicator1 Indicator2
1   100        200
2   <Missing>  500
3   -200       1000
4   100        <Missing>
5   <Missing>  500
6   -200       100

For example, if Indicator1 is less than 300, the new Indicator2 value is Indicator2 + 5% * Indicator1.

I need to use

mapRows fun k v -> let var1 = v.get("Indicator1")
                   let var2 = v.get("Indicator2")
                   run through the conditions and produce new var1 and var2
                   produce a objectSeries
|> Frame.ofRows

The pseudocode above sounds simple, but I just can't figure out how to produce a proper ObjectSeries to recreate the Frame.

I also noticed something I can't explain with the mapRows function [SO question]: /questions/5114665/deedle-framemaprows-kak-pravilno-ego-ispolzovat-i-kak-pravilno-postroit-seriyu-obektov

Update

Since the original question was posted, I have used Deedle in C#. To my surprise, row-based calculations are very easy in C#, and the way the C# Frame.Rows function handles missing values is very different from the F# mapRows function. Below is a very basic example I used to try to verify the logic. It might be useful to anyone who is looking for a similar application:

Things to pay attention to:

  1. The Rows function did not remove the row in which the values of both columns are missing.
  2. The Mean function is smart enough to calculate the mean based on the available data points.

using System;                     // DateTime
using System.Collections.Generic; // KeyValuePair
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Deedle;

namespace TestDeedleRowProcessWithMissingValues
{
    class Program
    {
        static void Main(string[] args)
        {
            var s1 = new SeriesBuilder<DateTime, double>(){
                 {DateTime.Today.Date.AddDays(-5),10.0},
                 {DateTime.Today.Date.AddDays(-4),9.0},
                 {DateTime.Today.Date.AddDays(-3),8.0},
                 {DateTime.Today.Date.AddDays(-2),double.NaN},
                 {DateTime.Today.Date.AddDays(-1),6.0},
                 {DateTime.Today.Date.AddDays(-0),5.0}
             }.Series;

            var s2 = new SeriesBuilder<DateTime, double>(){
                 {DateTime.Today.Date.AddDays(-5),10.0},
                 {DateTime.Today.Date.AddDays(-4),double.NaN},
                 {DateTime.Today.Date.AddDays(-3),8.0},
                 {DateTime.Today.Date.AddDays(-2),double.NaN},
                 {DateTime.Today.Date.AddDays(-1),6.0}                 
             }.Series;

            var f = Frame.FromColumns(new KeyValuePair<string, Series<DateTime, double>>[] { 
                KeyValue.Create("s1",s1),
                KeyValue.Create("s2",s2)
            });

            s1.Print();
            f.Print();


            f.Rows.Select(kvp => kvp.Value).Print();

//            29/05/2015 12:00:00 AM -> series [ s1 => 10; s2 => 10]
//            30/05/2015 12:00:00 AM -> series [ s1 => 9; s2 => <missing>]
//            31/05/2015 12:00:00 AM -> series [ s1 => 8; s2 => 8]
//            1/06/2015 12:00:00 AM  -> series [ s1 => <missing>; s2 => <missing>]
//            2/06/2015 12:00:00 AM  -> series [ s1 => 6; s2 => 6]
//            3/06/2015 12:00:00 AM  -> series [ s1 => 5; s2 => <missing>]


            f.Rows.Select(kvp => kvp.Value.As<double>().Mean()).Print();

//            29/05/2015 12:00:00 AM -> 10
//            30/05/2015 12:00:00 AM -> 9
//            31/05/2015 12:00:00 AM -> 8
//            1/06/2015 12:00:00 AM  -> <missing>
//            2/06/2015 12:00:00 AM  -> 6
//            3/06/2015 12:00:00 AM  -> 5


            //Console.ReadLine();
        }
    }
}

1 Answer

You can map over all the values in your frame using Frame.mapValues. Provide it with a function that takes your data type and returns the updated value.

open Deedle

let indicator1 = [100.0;300.0;-200.0] |> Series.ofValues
let indicator2 = [200.0;500.0;1000.0] |> Series.ofValues

let frame = Frame.ofColumns ["indicator1" => indicator1; "indicator2" => indicator2]
// val frame : Frame<int,string> =
// 
//     indicator1 indicator2
// 0 -> 100        200       
// 1 -> 300        500       
// 2 -> -200       1000     

let update v =
  match v with
  |v when v<500.0 && v>0.0 -> v * 1.1
  |v when v<0.0 -> nan
  |v -> v

let newFrame = frame |> Frame.mapValues update
// val newFrame : Frame<int,string> =
//  
//      indicator1 indicator2
// 0 -> 110        220       
// 1 -> 330        500       
// 2 -> <missing>  1000 
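
For the updated requirement, where the new Indicator2 depends on Indicator1 in the same row, a row-based variant could look like the sketch below. It is only a sketch under a few assumptions: the helper names `adjustIndicator2`, `toFloat` and `updateRow` are made up for illustration, the sample frame is rebuilt by hand from the table in the question, and I have not checked it against every Deedle version. It relies on `Frame.mapRows` producing a series of rows and on `Frame.ofRows` turning a series of series back into a frame. Missing inputs are read as options with `TryGetAs`, and NaN is used to write a missing value back.

```fsharp
#r "nuget: Deedle"
open Deedle

// Sample data from the updated question; keys absent from a column become missing values.
let input =
  frame [ "Indicator1" => series [ 1 => 100.0; 3 => -200.0; 4 => 100.0; 6 => -200.0 ]
          "Indicator2" => series [ 1 => 200.0; 2 => 500.0; 3 => 1000.0; 5 => 500.0; 6 => 100.0 ] ]

// Hypothetical helper (not part of Deedle): the rule from the question.
// If both values are present and Indicator1 < 300, add 5% of Indicator1 to Indicator2;
// otherwise leave Indicator2 as it is (including missing).
let adjustIndicator2 (i1: float option) (i2: float option) : float option =
  match i1, i2 with
  | Some a, Some b when a < 300.0 -> Some (b + 0.05 * a)
  | _ -> i2

// Hypothetical helper: NaN stands in for a missing value in a float series.
let toFloat (v: float option) = defaultArg v nan

// Build the replacement row as an ordinary series keyed by column name.
let updateRow (row: ObjectSeries<string>) =
  let i1 = row.TryGetAs<float>("Indicator1") |> OptionalValue.asOption
  let i2 = row.TryGetAs<float>("Indicator2") |> OptionalValue.asOption
  series [ "Indicator1" => toFloat i1
           "Indicator2" => toFloat (adjustIndicator2 i1 i2) ]

// Frame in, frame out: map every row, then reassemble the rows into a frame.
let output =
  input
  |> Frame.mapRows (fun _ row -> updateRow row)
  |> Frame.ofRows
```

Keeping the rule in a plain function over `float option` means it can be checked without a frame, and the only Deedle-specific part is reading the row and rebuilding it. If the rule never needs another column, `Frame.mapValues` as shown above remains the simpler choice.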