VMware Tanzu Greenplum

 Not able to read from GreenPlum using Spark Connector

jithender boreddy posted Mar 22, 2020 10:25 AM

Could someone please help me resolve this issue?

I am trying to read from Greenplum using the Greenplum-Spark connector. I used the jar greenplum-spark_2.11-1.5.0.jar, which I downloaded from https://network.pivotal.io/products/pivotal-gpdb/

I am accessing Greenplum from spark-shell and loaded the jar as shown below:

C:\spark-shell --jars C:\jars\greenplum-spark_2.11-1.6.2.jar

 

scala> val gscReadOptionMap = Map(
  "url" -> "jdbc:postgresql://server-ip:5432/db_name",
  "user" -> "user_id",
  "password" -> "pwd",
  "dbschema" -> "schema_name",
  "dbtable" -> "table_name",
  "driver" -> "org.postgresql.Driver"
)

 

scala> val gpdf = spark.read.format("greenplum").options(gscReadOptionMap).load()

(or)

scala> val gpdf = spark.read.format("io.pivotal.greenplum.spark.GreenplumRelationProvider").options(gscReadOptionMap).load()

 

Both attempts result in the error below:

java.lang.IllegalArgumentException: '' does not exist in "schema_name"."table_name" table
  at io.pivotal.greenplum.spark.GreenplumRelationProvider.createRelation(GreenplumRelationProvider.scala:50)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
  ... 49 elided

 

Oliver Albertini

Looks like this was already addressed here. Thanks for the question.
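
For readers landing on this thread, here is one plausible reading of the error, offered as an assumption rather than as a summary of the linked answer. The empty name in '' does not exist in "schema_name"."table_name" suggests the connector looked up a column whose name was blank. As far as I know, the 1.x releases of the connector expect a "partitionColumn" option naming an integer-type column of the table, and the option map in the question does not set one. The sketch below adds that option and checks the map before calling load(). The column name "id" and the helper missingOptions are hypothetical, made up for illustration, and are not part of the connector. It is meant to be pasted into spark-shell or a Scala REPL.

```scala
// Sketch only: the option map from the question plus a partitionColumn entry.
// "id" is a hypothetical column name; replace it with a real column of table_name.
val gscReadOptionMap = Map(
  "url" -> "jdbc:postgresql://server-ip:5432/db_name",
  "user" -> "user_id",
  "password" -> "pwd",
  "dbschema" -> "schema_name",
  "dbtable" -> "table_name",
  "driver" -> "org.postgresql.Driver",
  "partitionColumn" -> "id" // assumption: expected by connector 1.x
)

// Hypothetical helper (not a connector API): returns the required keys
// that are either absent from the map or mapped to a blank string.
def missingOptions(opts: Map[String, String], required: Seq[String]): Seq[String] =
  required.filter(key => opts.get(key).forall(_.trim.isEmpty))

// Fail early with a readable message instead of the connector's exception.
val missing = missingOptions(gscReadOptionMap, Seq("url", "dbtable", "partitionColumn"))
require(missing.isEmpty, s"Missing connector options: ${missing.mkString(", ")}")

// In spark-shell, with the connector jar on the classpath:
// val gpdf = spark.read.format("greenplum").options(gscReadOptionMap).load()
```

If the read still fails after setting "partitionColumn", it is also worth confirming which jar is actually loaded, since the question mentions both 1.5.0 and 1.6.2.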