Hello, my name is Wataru Yukawa. I work at LINE as a data engineer.
As a data engineer, my daily work involves collecting logs with Fluentd, storing them in Hadoop, and aggregating and analyzing them with Hive. Our Hadoop cluster is medium-sized: 40 nodes with approximately 370TB of DFS used space. The LINE family apps generate less data than the LINE app itself. While the volume is nowhere near large enough to be called big data, the cluster still handles many different types of data and Fluentd tags, and runs over 400 Fluentd processes, because so many LINE family services feed into it. At peak times, the Fluentd data flow reaches about 150,000 messages per second.