Commit bf0727d1, authored by openaiops

Initial commit

## Spark
Apache Spark (https://spark.apache.org) is a unified analytics engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. Spark has been widely deployed in industry for big data processing.
The log set was collected by aggregating logs from the Spark system in our lab at CUHK, which comprises a total of 32 machines. The logs are aggregated at the machine level. However, three machines have been repaired, and unfortunately some logs were lost. The logs are large (over 2GB) and cover both normal and abnormal application runs. They are provided as-is, without further modification or labelling.
### Download
The raw logs are available for downloading at https://github.com/logpai/loghub.
### Citation
If you use this dataset from loghub in your research, please cite the following paper.
+ Jieming Zhu, Shilin He, Pinjia He, Jinyang Liu, Michael R. Lyu. [Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics](https://arxiv.org/abs/2008.06448). IEEE International Symposium on Software Reliability Engineering (ISSRE), 2023.
EventId,EventTemplate
E1,attempt_<*>: Committed
E2,"Block <*> stored as bytes in memory (estimated size <*>, free <*>)"
E3,"Block <*> stored as values in memory (estimated size <*>, free <*>)"
E4,Changing modify acls to: <*>
E5,Changing view acls to: <*>
E6,Connecting to driver: spark://<*>
E7,Created local directory at <*>
E8,File Output Committer Algorithm version is <*>
E9,Finished task <*> in stage <*> (TID <*>). <*> bytes result sent to driver
E10,Found block rdd_<*> locally
E11,Got assigned task <*>
E12,Input split: hdfs://<*>
E13,"mapred.job.id is deprecated. Instead, use mapreduce.job.id"
E14,"mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id"
E15,"mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap"
E16,"mapred.task.partition is deprecated. Instead, use mapreduce.task.partition"
E17,"mapred.tip.id is deprecated. Instead, use mapreduce.task.id"
E18,MemoryStore started with capacity <*> GB
E19,"Partition rdd_<*> not found, computing it"
E20,Reading broadcast variable <*> took <*> ms
E21,Registered BlockManager
E22,"Registered signal handlers for [TERM, HUP, INT]"
E23,Remoting started; listening on addresses :[akka.tcp://<*>]
E24,Running task <*> in stage <*> (TID <*>)
E25,Saved output of task 'attempt_<*>' to hdfs://<*>
E26,"SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, curi); users with modify permissions: Set(yarn, curi)"
E27,Server created on <*>
E28,Slf4jLogger started
E29,Started reading broadcast variable <*>
E30,Starting executor ID <*> on host <*>
E31,Starting remoting
E32,Successfully registered with driver
E33,Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port <*>.
E34,Successfully started service 'sparkExecutorActorSystem' on port <*>.
E35,"Times: total = <*>, boot = <*>, init = <*>, finish = <*>"
E36,Trying to register BlockManager
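
The templates above use `<*>` as a wildcard for the variable parts of a log message. As a minimal sketch of how such a template file might be applied, the Python below turns each template into a regular expression and returns the first matching `EventId` together with the extracted parameters. The function names (`template_to_regex`, `load_templates`, `match_event`) and the sample log messages are hypothetical and chosen for illustration only. They are not part of loghub, and this is not necessarily how the templates were originally produced.

```python
import csv
import io
import re


def template_to_regex(template):
    """Turn a template such as 'Got assigned task <*>' into a compiled regex.

    Literal text is escaped; each '<*>' becomes a non-greedy capture group.
    """
    literal_parts = [re.escape(part) for part in template.split("<*>")]
    return re.compile("^" + "(.+?)".join(literal_parts) + "$")


def load_templates(csv_text):
    """Parse 'EventId,EventTemplate' CSV text into (event_id, regex) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["EventId"], template_to_regex(row["EventTemplate"]))
            for row in reader]


def match_event(message, templates):
    """Return (event_id, parameters) for the first matching template, else None."""
    for event_id, pattern in templates:
        found = pattern.match(message)
        if found:
            return event_id, list(found.groups())
    return None


# A few rows copied from the template list above.
SAMPLE_TEMPLATES = '''EventId,EventTemplate
E2,"Block <*> stored as bytes in memory (estimated size <*>, free <*>)"
E11,Got assigned task <*>
E24,Running task <*> in stage <*> (TID <*>)
'''

templates = load_templates(SAMPLE_TEMPLATES)

# Hypothetical log message content, made up for this example.
result = match_event("Running task 0.0 in stage 0.0 (TID 0)", templates)
print(result)
```

Templates are tried in file order and the first match wins, so with the full list a more specific template should appear before a more general one that could also match the same message.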