
DataSet: the API for batch processing (bounded data)
DataStream: the API for stream processing (unbounded data)
Basic parts of a Flink program:
- Obtain an execution environment
- Load/create the initial data
- Specify transformations on this data
- Specify where to put the results of your computations
- Trigger the program execution
Lazy evaluation: loading data and specifying transformations only adds operations to the program's plan. Nothing runs until execution is explicitly triggered by an execute() call on the execution environment.
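The build-then-trigger behaviour can be illustrated without a Flink dependency. The sketch below is only an analogy using the JDK's java.util.stream, which is also lazy: the map step stands in for a transformation and the terminal collect stands in for execute(). The class and method names (LazyPipelineSketch, buildPipeline, run) are made up for this illustration and are not part of Flink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipelineSketch {
    // Records which elements the map step has actually processed.
    static final List<Integer> touched = new ArrayList<>();

    // Declares a source and a transformation; nothing is processed here.
    static Stream<Integer> buildPipeline() {
        return Stream.of(1, 2, 3)
                .map(x -> {
                    touched.add(x);
                    return x * 2;
                });
    }

    // Triggers the pipeline; the terminal collect plays the role of execute().
    static List<Integer> run(Stream<Integer> pipeline) {
        return pipeline.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = buildPipeline();
        System.out.println("processed before trigger: " + touched.size());
        List<Integer> result = run(pipeline);
        System.out.println("result: " + result);
        System.out.println("processed after trigger: " + touched.size());
    }
}
```

In a real Flink program the same shape applies: the environment, sources, transformations and sinks are declared first, and the work only starts at the execute() call.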
Some transformations (join, coGroup, keyBy, groupBy) require a key to be defined on the elements.
Other transformations (Reduce, GroupReduce, Aggregate, Windows) allow the data to be grouped on a key before they are applied.
Keys are “virtual”: they are defined as functions over the actual data to guide the grouping operator.
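To make the idea of a "virtual" key concrete, here is a minimal plain-Java sketch, again an analogy rather than Flink code. The element type has no dedicated key field; the key is whatever a selector function computes from the element, and the grouping follows that function. WordCount, sumByKey and sample are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class VirtualKeySketch {
    // Hypothetical element type; note that it has no "key" field.
    static final class WordCount {
        final String word;
        final int count;

        WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // Groups elements by the value the key function returns, then sums counts per group.
    static <K extends Comparable<K>> Map<K, Integer> sumByKey(
            List<WordCount> data, Function<WordCount, K> keySelector) {
        return data.stream().collect(
                Collectors.groupingBy(keySelector, TreeMap::new,
                        Collectors.summingInt(wc -> wc.count)));
    }

    // Small made-up data set for the demonstration.
    static List<WordCount> sample() {
        return Arrays.asList(
                new WordCount("apple", 1),
                new WordCount("avocado", 2),
                new WordCount("apple", 2),
                new WordCount("banana", 5));
    }

    public static void main(String[] args) {
        List<WordCount> data = sample();
        // Same data, two different keys: the key exists only as a function.
        Map<String, Integer> byWord = sumByKey(data, wc -> wc.word);
        Map<Character, Integer> byFirstLetter = sumByKey(data, wc -> wc.word.charAt(0));
        System.out.println(byWord);
        System.out.println(byFirstLetter);
    }
}
```

Swapping the selector changes the grouping without touching the data, which is the sense in which the key is virtual.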



