
how to read from HDFS multiple parquet files with spark.index.create .mode("overwrite").indexBy($"cellid").parquet #95

@silviuchiric

Description


(attached screenshot: parquet_issue)
I built the jar from master and successfully imported it in a Jupyter Notebook:

```
%AddJar file:/srv/home/srv-taurus-stage/sbd2-notebook/jupyter/parquet-index_2.11-0.4.1-SNAPSHOT.jar
```

and added the implicits import:

```scala
import com.github.lightcopy.implicits._
```

However, when I try to create the index by pointing at parquet files on HDFS (a path I verified exists in the cell above):

```scala
spark.index.create
  .mode("overwrite")
  .indexBy($"cellid")
  .parquet("hdfs://///data/taurus/stage/taurus.stage.counter-lte-eri-cell-raw-parquet/time=ingestion/bucket=hourly/date=2019-11-2*/*")
```

it fails with:

```
Message: File does not exist:
```

How does the `parquet` method know to read from HDFS? It looks like it does not like the `hdfs:////` prefix in the path.
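One possible cause is the `hdfs://///` prefix itself: with that many slashes the URI has an empty authority (no namenode host) and surplus leading slashes left in its path component. Below is a minimal sketch using plain `java.net.URI` (outside Spark) to show the difference; trying `hdfs:///data/...` or an explicit `hdfs://<namenode-host>:<port>/data/...` are untested suggestions about this cluster, not confirmed fixes.

```scala
import java.net.URI

// Five slashes: empty authority, and the extra slashes stay in the path.
val malformed = new URI("hdfs://///data/taurus/stage")
// Three slashes: empty authority, but a clean absolute path.
val canonical = new URI("hdfs:///data/taurus/stage")

// Neither form names a host, so Hadoop would fall back to fs.defaultFS;
// the malformed one additionally carries "///data/..." as its path,
// which can confuse FileSystem path resolution.
println(malformed.getHost)  // null
println(malformed.getPath)  // ///data/taurus/stage
println(canonical.getPath)  // /data/taurus/stage
```

If `fs.defaultFS` on the notebook host is not set to this cluster, passing the fully qualified form with the namenode host and port in the authority avoids relying on the client configuration at all.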
