Introduction
Apache Beam pipelines can be written in different languages and executed on different Runners.
To use the Java SDK with the direct (local) runner, add these Gradle dependencies:
implementation("org.apache.beam:beam-sdks-java-core:2.45.0")
runtimeOnly("org.apache.beam:beam-runners-direct-java:2.45.0")
Example
The following pipeline reads a text file and writes its lines back out:
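import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class App {
    public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.create();

        // Create pipeline
        Pipeline p = Pipeline.create(options);

        // Read text data from Sample.txt
        PCollection<String> textData = p.apply(TextIO.read().from("Sample.txt"));

        // Write to the output file with wordcounts as a prefix
        textData.apply(TextIO.write().to("wordcounts"));

        // Run the pipeline
        p.run().waitUntilFinish();
    }
}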
1. Create a PipelineOptions instance.
2. Create a Pipeline with those options.
3. Add a step that reads Sample.txt. It returns a PCollection<String>, Apache Beam's abstraction of a dataset, with one element per line of the file.
4. Add a step that writes the PCollection from the previous step to output files whose names start with wordcounts. Despite the prefix, this pipeline only copies the lines; it does not count words.
5. Lastly, run the pipeline and wait until it finishes.
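A minimal sketch of how the same pipeline might be extended to actually count words, following the pattern of the Beam Java SDK's built-in transforms (FlatMapElements, Filter, Count.perElement, MapElements). The class name WordCountApp and the helper splitWords are hypothetical, and the regular expression is only an assumption about what counts as a word.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.Filter;
import org.apache.beam.sdk.transforms.FlatMapElements;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.TypeDescriptors;

public class WordCountApp {

    // Hypothetical helper: lowercases a line and splits it on runs of non-letters.
    // A line that starts with punctuation yields a leading empty string,
    // which the pipeline filters out below.
    static List<String> splitWords(String line) {
        return Arrays.asList(line.toLowerCase().split("[^\\p{L}]+"));
    }

    public static void main(String[] args) {
        // fromArgs lets a runner be chosen on the command line, e.g. --runner=DirectRunner
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        Pipeline p = Pipeline.create(options);

        p.apply("ReadLines", TextIO.read().from("Sample.txt"))
            // One input line becomes zero or more word elements
            .apply("SplitWords", FlatMapElements.into(TypeDescriptors.strings())
                .via((String line) -> splitWords(line)))
            .apply("DropEmpty", Filter.by((String word) -> !word.isEmpty()))
            // Produces KV<word, number of occurrences>
            .apply("CountWords", Count.perElement())
            .apply("FormatResults", MapElements.into(TypeDescriptors.strings())
                .via((KV<String, Long> kv) -> kv.getKey() + ": " + kv.getValue()))
            .apply("WriteCounts", TextIO.write().to("wordcounts"));

        p.run().waitUntilFinish();
    }
}
```

With the dependencies listed earlier, this runs locally on the direct runner. The order of the output lines is not guaranteed, because a PCollection is unordered.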