Can a program that uses Spark be launched directly with java -jar?

I wrote a word-count program for Spark. In local mode I can debug it in Eclipse, or package it with Maven and run it with the java -jar command:


 import java.util.Arrays;
 import java.util.regex.Pattern;

 import org.apache.spark.SparkConf;
 import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.api.java.JavaSparkContext;
 import org.apache.spark.api.java.function.FlatMapFunction;

 SparkConf sparkConf = new SparkConf().setAppName("JavaWordCount");
 sparkConf.setMaster("local");
 JavaSparkContext ctx = new JavaSparkContext(sparkConf);
 JavaRDD<String> lines = ctx.textFile("file:///c:/sparkTest.txt");

 // Pattern used to split each line into words
 final Pattern SPACE = Pattern.compile(" ");

 JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
   @Override
   public Iterable<String> call(String s) {
     return Arrays.asList(SPACE.split(s));
   }
 });

 System.out.println(words.count());

I assumed that to switch to standalone client mode I only had to change the second line to

 sparkConf.setMaster("spark://localhost:7077");

and it would work. However, I have read several articles saying that a Spark program must be launched with the spark-submit command, or from another Java program that uses SparkLauncher, so starting it directly with java -jar is not possible. Doesn't that contradict what I observed in local mode? Or is local mode a special case, so that standalone client mode, standalone cluster mode, and YARN mode can only be launched with spark-submit? With spark-submit it is impossible to debug in Eclipse, which is very inconvenient.
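
To make the "another Java program that uses SparkLauncher" option concrete, here is a minimal sketch of programmatic submission with the SparkLauncher API; the jar path and main class below are placeholders for whatever your Maven build actually produces:

 import org.apache.spark.launcher.SparkAppHandle;
 import org.apache.spark.launcher.SparkLauncher;

 public class SubmitWordCount {
   public static void main(String[] args) throws Exception {
     // Submits the packaged jar to the standalone master programmatically,
     // which is roughly what spark-submit does from the command line.
     SparkAppHandle handle = new SparkLauncher()
         .setAppResource("C:/work/wordcount.jar")    // placeholder: Maven-built application jar
         .setMainClass("com.example.JavaWordCount")  // placeholder: your main class
         .setMaster("spark://localhost:7077")
         .setConf(SparkLauncher.DRIVER_MEMORY, "1g")
         .startApplication();

     // Wait until the application reaches a terminal state.
     while (!handle.getState().isFinal()) {
       Thread.sleep(1000);
     }
     System.out.println("Final state: " + handle.getState());
   }
 }

Note that SparkLauncher only replaces the command-line step; the application code itself is unchanged.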

Apr.03,2021

Local mode is intended only for debugging and runs everything in a single process, so nothing has to be distributed: the local configuration and jars never leave your machine.
In cluster mode, each submitter needs to distribute some local configuration and jar packages to the cluster.
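
As a concrete illustration of what has to be shipped: if you do start the driver with java -jar in standalone client mode, the executors still need your application jar, so you would at least have to list it on the SparkConf yourself (spark-submit otherwise handles this for you). A sketch, with the jar path as a placeholder:

 // The executors need the application classes, so the packaged jar is listed explicitly.
 SparkConf sparkConf = new SparkConf()
     .setAppName("JavaWordCount")
     .setMaster("spark://localhost:7077")
     .setJars(new String[] { "C:/work/wordcount.jar" }); // placeholder: path to the Maven-built jar
 JavaSparkContext ctx = new JavaSparkContext(sparkConf);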

I hope this article will help you: http://www.russellspitzer.com.
