Col_1 Col_2 Col_3
1 1a 10.1
1 1b 10
1 1c 10.8
1 1d 10.6
2 2a 11
2 2b 9.8
2 2c 10
Col_1 is associated with Col_2.
When I query this for an individual record:
val A = [login to view URL]("table_name").filter($"Col_1" === "1").select("Col_2").[login to view URL]()
I get the output as:
Col_2
1a
1b
1c
1d
Now I want to run this as a loop, taking each value from Col_1 and returning the Col_2 output associated with it.
Then I want to compute the Pearson correlation between Col_2 and Col_3 on each resulting dataframe.
E.g., the correlation between:
Col_2 Col_3
1a 10.1
1b 10
1c 10.8
1d 10.6
Then next between:
Col_2 Col_3
2a 11
2b 9.8
2c 10
And this will go on in a loop.
I want to see every correlation below 0.4 in a separate table:
Col_2 Correlation
1_d 0.3
2_a 0.1
I want the above as one method.
This needs to be done in Spark Scala.
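One possible shape for such a method is sketched below. It is only a sketch under stated assumptions: Pearson correlation needs two numeric columns, and the sample Col_2 values are labels, so the example correlates Col_3 with a hypothetical numeric column Col_4 whose values are made up purely for illustration. The names GroupedCorrelation and lowCorrelations are also hypothetical. Instead of looping over Col_1, it groups by Col_1 and uses Spark's built-in corr aggregate, then keeps the groups below the threshold.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, corr}

object GroupedCorrelation {

  // Computes the Pearson correlation of xCol and yCol within each value of
  // groupCol and keeps only the groups whose correlation is below threshold.
  // Both xCol and yCol are assumed to be numeric.
  def lowCorrelations(df: DataFrame,
                      groupCol: String,
                      xCol: String,
                      yCol: String,
                      threshold: Double): DataFrame =
    df.groupBy(col(groupCol))
      .agg(corr(col(xCol), col(yCol)).as("Correlation"))
      .filter(col("Correlation") < threshold)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("grouped-correlation-sketch")
      .getOrCreate()
    import spark.implicits._

    // Col_1, Col_2, Col_3 come from the sample table in the post.
    // Col_4 is HYPOTHETICAL: illustrative numbers, not data from the post.
    val df = Seq(
      (1, "1a", 10.1, 1.0),
      (1, "1b", 10.0, 2.0),
      (1, "1c", 10.8, 3.0),
      (1, "1d", 10.6, 4.0),
      (2, "2a", 11.0, 1.0),
      (2, "2b", 9.8, 2.0),
      (2, "2c", 10.0, 3.0)
    ).toDF("Col_1", "Col_2", "Col_3", "Col_4")

    // One row per Col_1 group whose correlation is below 0.4.
    lowCorrelations(df, "Col_1", "Col_3", "Col_4", 0.4).show()

    spark.stop()
  }
}
```

The result here is keyed by Col_1 rather than Col_2, because a correlation is a single number per group of rows. If a per-Col_2 row is needed, the result could be joined back to the original table on Col_1.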
I am an expert in Spark Scala. Ideally, Spark code should not be run as a loop, because vectorized code is much faster. I can vectorize this for you and have it done in an afternoon, along with any other code you may need.
$30 USD in 1 day
2 freelancers are bidding an average of $75 USD for this job