Error when using ETL to slice a raster TIFF file in HDFS #3508

@KiktMa

An error occurred while using the GeoTrellis ETL to build a pyramid for raster data stored in HDFS and write it to Accumulo.

I am using GeoTrellis 2.1.0, Scala 2.11.7, Hadoop 2.7.7, Spark 2.3.4, and JDK 1.8.

After writing input.json, output.json, and backend-profiles.json, I submitted the job with spark-submit, using the class geotrellis.spark.etl.SinglebandIngest:

./bin/spark-submit --class geotrellis.spark.etl.SinglebandIngest --master yarn /usr/local/app/spark/spark-2.3.4/jars/geotrellis-spark-etl_2.11-2.1.0.jar --input file:///app/tif/json/input.json --output file:///app/tif/json/output.json --backend-profiles file:///app/tif/json/backend-profiles.json

Error output:

TaskSetManager:66 - Lost task 0.0 in stage 0.0 (TID 0, node1, executor 2): java.lang.NegativeArraySizeException
        at scala.reflect.ManifestFactory$$anon$6.newArray(Manifest.scala:93)
        at scala.reflect.ManifestFactory$$anon$6.newArray(Manifest.scala:91)
        at scala.Array$.ofDim(Array.scala:218)
        at geotrellis.raster.UByteArrayTile$.ofDim(UByteArrayTile.scala:239)
        at geotrellis.raster.UByteArrayTile$.empty(UByteArrayTile.scala:267)
        at geotrellis.raster.ArrayTile$.empty(ArrayTile.scala:431)
        at geotrellis.raster.io.geotiff.GeoTiffTile.mutable(GeoTiffTile.scala:698)
        at geotrellis.raster.io.geotiff.GeoTiffTile.toArrayTile(GeoTiffTile.scala:690)
        at geotrellis.spark.io.RasterReader$$anon$1.readFully(RasterReader.scala:67)
        at geotrellis.spark.io.RasterReader$$anon$1.readFully(RasterReader.scala:63)
        at geotrellis.spark.io.hadoop.HadoopGeoTiffRDD$$anonfun$apply$5$$anonfun$apply$6.apply(HadoopGeoTiffRDD.scala:148)
        at geotrellis.spark.io.hadoop.HadoopGeoTiffRDD$$anonfun$apply$5$$anonfun$apply$6.apply(HadoopGeoTiffRDD.scala:147)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:185)
        at scala.collection.AbstractIterator.reduceLeft(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1021)
        at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1019)
        at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2130)
        at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2130)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:109)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

My TIFF file is only 180 MB. How can I solve this problem? I increased the driver memory to 2 GB, but the error persists.
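
A possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the exception is thrown from Array.ofDim inside UByteArrayTile.ofDim, reached through GeoTiffTile.toArrayTile and RasterReader.readFully, which suggests the whole image is being decoded into a single array. JVM arrays are indexed by a 32-bit Int, so if the array length is computed as cols * rows (an assumption about the GeoTrellis internals), an image with more than 2,147,483,647 cells would yield a negative length no matter how much driver memory is available. A compressed 180 MB file can still be that large once decoded. The sketch below only illustrates the arithmetic; the object TileSizeCheck, its methods, and the 50000 x 50000 dimensions are hypothetical and are not taken from the file in this issue.

```scala
// Hypothetical helper illustrating 32-bit overflow of a raster cell count.
object TileSizeCheck {
  // Cell count in Int arithmetic; wraps around once it exceeds Int.MaxValue.
  def cellCountInt(cols: Int, rows: Int): Int = cols * rows

  // Same count in Long arithmetic; cannot wrap for Int inputs.
  def cellCountLong(cols: Int, rows: Int): Long = cols.toLong * rows.toLong

  // True when a single JVM array could index every cell.
  def fitsInOneArray(cols: Int, rows: Int): Boolean =
    cellCountLong(cols, rows) <= Int.MaxValue.toLong

  def main(args: Array[String]): Unit = {
    val (cols, rows) = (50000, 50000) // assumed dimensions, for illustration only
    println(s"Int cell count:    ${cellCountInt(cols, rows)}")
    println(s"Long cell count:   ${cellCountLong(cols, rows)}")
    println(s"Fits in one array: ${fitsInOneArray(cols, rows)}")
    try {
      // A negative length makes the JVM throw NegativeArraySizeException.
      Array.ofDim[Byte](cellCountInt(cols, rows))
    } catch {
      case e: NegativeArraySizeException => println(s"Allocation failed: $e")
    }
  }
}
```

With these assumed dimensions the Int cell count comes out negative, which matches the kind of failure in the trace. If the raster's real dimensions (for example as reported by gdalinfo) multiply to more than Int.MaxValue, the likely remedy would be to avoid reading the file as one tile rather than to add memory.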
