Data Memes
484 subscribers
574 photos
9 videos
2 files
63 links
All the best data memes in one place!

https://surfalytics.com πŸ„β€β™€οΈ
How do you succeed in the corporate world? Here is one piece of advice: use the Magic Loop.

The Magic Loop consists of five steps. Some of these steps will require substantial time and effort, but they lead to not just rapid career growth but also a great relationship with your manager.

The steps:
1. Do your current job well
2. Ask your manager how you can help them
3. Do what they ask
4. Ask your manager if you could help in a way that also grows your skills toward a particular goal
5. Do as they suggest, and repeat in a loop from step 4
❀4πŸ‘2
Salaries in Canada and the US are different. The US pays better for sure, and the cost of living is more favorable too.
πŸ”₯2😱2πŸ€”1
At work and in interviews, we are obsessed with use cases. Why have we never thought to add them to the CV/resume? That is exactly what a business is looking for when it hires a professional to get shit done.
❀‍πŸ”₯7🐳2πŸ€”1
Have a great weekend! Don't forget to use the weekend wisely: keep learning and sharpening your data skills!


πŸ” πŸ” πŸ” πŸ” πŸ…°οΈπŸ” πŸ” πŸ” πŸ” πŸ” πŸ” 
🫑11πŸ€”2🀑2🌚1πŸ’―1
We start resumes from the wrong perspective:

"What have I done?" Write that down.

Instead, we need to think, "who is going to read this, what are they going to be looking for, how do I show that to them?"

It is entirely natural to start from the literal perspective of "What have I done." That is what we are taught should go in a resume. The term for a resume outside of the US is CV, Curriculum Vitae, meaning "course of (one's) life."

This is great, except that if I am a manager or recruiter evaluating potential hires, I am not thinking about the course of their lives; I am asking myself, "Can this person do this job? Are they going to fit our culture and solve my problems?"

Break out of the mold of trying to tell your life story; instead, help the reader understand how you are the best solution to their problem, the best asset to their company.

This means changing how you present your experience from "these are things I have done" to "here are examples of problems I have solved, capabilities I have developed, and impact I have had; I can do even more for you!"

Which reminds me (double value post today!): SHOW ME, do not TELL ME.

People like to write that they are "Highly motivated quick learners with excellent communication skills."

I call these "happy words," where people write unsupported positive things about themselves at the top of the resume. Happy words are a distraction, for a few reasons:

1) Everyone writes them
2) They are all mostly the same; no one ever writes "I'm lazy and a bit hard to get along with, but I have unique skills so hire me and put up with my drama, it will be worth it." At least that would stand out for consideration!
3) They are not backed up by data.

Good managers and recruiters skip right over your happy words to look at what you have DONE. What you have done SHOWS ME what you are really capable of. Because you have done it. So if you feel you are a quick learner, show that through an example of how you learned a new space in two weeks, allowing you to deliver something fast. If you think you have excellent communication skills, link me to your blog or your list of public speaking events.

Example: A candidate I was trying to hire once asked me about how I lead my team. I told him that I was a live streamer with 100 videos on YouTube, all of which my team could see, where I talked about my leadership style. I pointed out that if I were a hypocrite in those videos and led differently than what I said, my team would call me out on it. He said this was the best answer he ever heard because it contained proof, not just my claims.

Source: https://www.linkedin.com/posts/ethanevansvp_we-start-resumes-from-the-wrong-perspective-activity-7111743587922452480-je7L
❀4✍1
If you want to secure a long-lasting and successful career in data engineering, it's not enough to merely acquire proficiency in using a particular tool. Instead, you should strive to understand its inner workings at a fundamental level, as well as the first principles that underlie it.

"But how?" you may ask. "There are so many tools out there: Spark, Trino, BigQuery, Snowflake, etc."

That's certainly true. However, it might surprise you to discover that all of these systems share strikingly similar foundations.

For instance, they all depend on some variation of the MapReduce model to process data. They all require data shuffling between nodes for tasks like joining or grouping. They all rely on column-oriented data formats. They are all susceptible to issues such as skewed keys, the small object problem, uneven partitioning, and so on.
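
As a rough illustration (a toy sketch in plain Python, not how any of these engines is actually implemented), the map/shuffle/reduce shape that a distributed GROUP BY shares across all of them looks like this:

```python
from collections import defaultdict

# Toy (key, value) rows, as if spread across the partitions of two nodes.
partitions = [
    [("us", 10), ("eu", 5)],               # "node" 1
    [("us", 7), ("apac", 3), ("eu", 1)],   # "node" 2
]

# Map: each partition emits (key, value) pairs independently.
mapped = [pair for part in partitions for pair in part]

# Shuffle: route every pair with the same key to the same reducer.
# This is the network-heavy step that joins and group-bys all share,
# and where skewed keys and uneven partitioning hurt.
shuffled = defaultdict(list)
for key, value in mapped:
    shuffled[key].append(value)

# Reduce: aggregate per key, e.g. SELECT key, SUM(value) ... GROUP BY key.
totals = {key: sum(values) for key, values in shuffled.items()}
print(totals)  # {'us': 17, 'eu': 6, 'apac': 3}
```

A key like `'us'` that carries far more values than the others is exactly the "skewed key" problem: one reducer ends up doing most of the work.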

The key, naturally, lies in the details. What sets each tool apart is a unique set of trade-offs that its developers have chosen to make it particularly suited to address specific use cases. For instance, Trino terminates queries that exceed memory limits to prevent costly disk spills, as one of its primary objectives is low latency. Snowflake automatically handles data partitioning for a more user-friendly experience but relinquishes fine-grained control from end users. Spark offers maximum user control but may come across as a more complex tool, and so forth.

Nonetheless, if you dig into how these tools move data around, you'll discover that they are not that different after all. Plus, their functionalities continue to overlap and converge over time.

Therefore, my advice is to run an 'EXPLAIN' or equivalent command for every query you write and invest time in understanding the resulting output. Ensure you grasp how each part of your query maps to a specific stage within a physical plan. Use this knowledge to debug your queries. I can assure you that the expertise and experience acquired this way will be transferable to other similar tools or data warehouse vendors.

Individual tools may come and go at a rapid pace, but fundamental principles endure and change far less frequently.
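
To make the EXPLAIN habit concrete, here is a minimal sketch using sqlite3 from the Python standard library as a stand-in engine (the tables and index are made up for illustration); Trino, Snowflake, and Spark SQL each have their own EXPLAIN with the same workflow:

```python
import sqlite3

# Hypothetical toy schema; any engine with an EXPLAIN-style command works the same way.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (id INTEGER, region TEXT);
    CREATE INDEX idx_customers_id ON customers (id);
""")

query = """
    SELECT c.region, SUM(o.amount)
    FROM orders o JOIN customers c ON o.customer_id = c.id
    GROUP BY c.region
"""

# EXPLAIN QUERY PLAN is SQLite's equivalent of EXPLAIN elsewhere:
# each row describes one step of the physical plan (scans, index lookups, sorts).
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)
```

The point is the habit, not this particular engine: map each part of the SQL (the join, the GROUP BY) to a plan step, and notice when a scan appears where you expected an index lookup.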

Source: https://www.linkedin.com/posts/izeigerman_dataengineering-activity-7110648980732080128-DKmn
🐳1
In analytics you should always know what to measure and why: at least 2-3 metrics for your business domain.

PS avoid vanity metrics 🀨
🐳5❀1πŸ”₯1
Best advice ever: wanna rise? Go get it at a new company. Don't listen to corporate bullshit or your manager; take action. Your growth and income are on you.
🫑8πŸ’―3πŸ€—2
In an ideal world there is no lowballing and everyone is honest upfront.
πŸ”₯2✍1🍾1
Is anything stopping you from success?! πŸ–
🍾1
🐳6❀3πŸ”₯1
Is a university degree important for data jobs? Not at all. No one cares what degree you have; skills are more important.

Today I talked with a colleague who paid $50k for one year of a Master's in Business Analytics at a 3rd-tier US university, plus a year of cost of living: about $80k wasted overall. Yes, she got the job and some skills, but at what cost? With the right focus and content, she could have "faked it and made it" in 4-5 months. Now imagine a 2-year degree at a 1st- or 2nd-tier university, with cost of living on top 🫨
πŸ’―5✍1
🌟 Parquet:
Advantages: Columnar, compressed, schema evolution support!
Disadvantages: Not for write-heavy workloads.
Use Cases: Analytical querying & data warehousing.

🌟 Avro:
Advantages: Row-based, schema evolution, efficient serialization.
Disadvantages: Slower for analytical queries.
Use Cases: Data serialization & data interchange.

🌟 JSON:
Advantages: Human-readable & schema flexible.
Disadvantages: Inefficient storage.
Use Cases: Web data interchange & configuration.

🌟 Delta Lake:
Advantages: ACID transactions, schema enforcement.
Disadvantages: Closely tied to the Spark ecosystem.
Use Cases: ACID transactions & schema enforcement in Data Lakes.

πŸš€ Tips for Maximizing Benefits in #Spark:
- Choosing Format: Select data format based on read-write patterns, query performance, and storage efficiency.

- Partitioning: Properly partition data to optimize read performance, especially for large datasets.

- Compression: Choose an appropriate compression codec considering the trade-off between storage space and CPU usage.

- Caching: Leverage Spark’s caching features for frequently accessed datasets.

- Schema Evolution: Design schemas thoughtfully to allow for evolution over time without causing data inconsistency or requiring expensive migrations.
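
The row-vs-column trade-off behind the Parquet and Avro entries above can be sketched in plain Python (toy data, no real Parquet or Avro files involved):

```python
# Three records of a toy table.
rows = [
    {"id": 1, "city": "NYC", "amount": 20.0},
    {"id": 2, "city": "SEA", "amount": 35.0},
    {"id": 3, "city": "NYC", "amount": 10.0},
]

# Row layout (Avro-like): each record is stored contiguously,
# so appending a new record touches one small contiguous chunk.
row_store = list(rows)
row_store.append({"id": 4, "city": "SEA", "amount": 5.0})

# Column layout (Parquet-like): each column is stored contiguously,
# which also lets similar values compress well together.
col_store = {key: [r[key] for r in rows] for key in rows[0]}

# An analytical query like SUM(amount) reads exactly one column here...
total = sum(col_store["amount"])

# ...but in the row layout it must touch every full record.
total_from_rows = sum(r["amount"] for r in row_store[:3])

print(total, total_from_rows)  # 65.0 65.0
```

Same answer both ways, but the columnar layout skipped `id` and `city` entirely, which is why Parquet wins for analytical scans and row-based formats win for write-heavy, record-at-a-time workloads.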
❀‍πŸ”₯5❀1
❀1
Only remote!
πŸ’―9❀1
From rockyourdata.cloud: "We have awesome news! We launched education programs for Data Engineer, Data Analyst, and BI Engineer positions. We are going to pour years of experience into our curriculum and help people move into the data industry and land their first job."

Rock Your Data is a North American consulting company focused on Cloud Analytics.

Please share https://www.linkedin.com/posts/rock-your-data_dataengineer-dataanalyst-biengineer-activity-7118664122300321792-mXWV
πŸ”₯8❀2
Enjoy your time at work πŸ€™
πŸ”₯6❀‍πŸ”₯2🫑1