Machine Learning
Machine learning insights, practical tutorials, and clear explanations for beginners and aspiring data scientists. Follow the channel for models, algorithms, coding guides, and real-world ML applications.

Admin: @HusseinSheikho || @Hussein_Sheikho
### Hugging Face Transformers: Unlock the Power of Open-Source AI in Python

Hugging Face Transformers is a Python library that gives developers and data scientists access to thousands of pretrained, open-source AI models. These models cover tasks across several modalities, including natural language processing (NLP), computer vision, audio processing, and multimodal learning.

#### Why Choose Hugging Face Transformers?

1. Cost Efficiency: Starting from a pretrained model avoids most of the compute and data costs of training a custom model from scratch.
2. Time Savings: With the base model already trained, you can focus on fine-tuning and deploying your application.
3. Control and Customization: Because the models are open source, you can run them on your own infrastructure and fine-tune them for your project's requirements.

#### Versatile Applications

Whether you're working on text classification, sentiment analysis, image recognition, speech-to-text conversion, or another AI-driven task, Hugging Face Transformers offers pretrained models for it. The library's large model collection gives you access to current architectures without needing the resources to train them yourself.

#### Get Started Today!

Get started with open-source AI using Hugging Face Transformers. Detailed tutorials and practical examples are available at:
https://realpython.com/huggingface-transformers/

Join our community on Telegram (@DataScienceM) for continuous learning and support.

🧠 #HuggingFaceTransformers #OpenSourceAI #PretrainedModels #NaturalLanguageProcessing #ComputerVision #AudioProcessing #MultimodalLearning #AIDevelopment #PythonLibrary #DataScienceCommunity
• Get raw audio data as a NumPy array.
import numpy as np
samples = np.array(audio.get_array_of_samples())

• Create a Pydub segment from a NumPy array.
new_audio = AudioSegment(
    samples.tobytes(),
    frame_rate=audio.frame_rate,
    sample_width=audio.sample_width,
    channels=audio.channels
)
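
For stereo audio, the flat array from get_array_of_samples() holds the channels interleaved (L, R, L, R, ...), so it usually needs reshaping before any per-channel work. The sketch below shows that step with NumPy only, using a hand-made array in place of a real segment. The helper names deinterleave and interleave are illustrative and not part of Pydub.

```python
import numpy as np

def deinterleave(samples: np.ndarray, channels: int) -> np.ndarray:
    """Reshape a flat interleaved array into shape (n_frames, channels)."""
    return samples.reshape(-1, channels)

def interleave(frames: np.ndarray) -> np.ndarray:
    """Flatten (n_frames, channels) back into the interleaved layout."""
    return frames.reshape(-1)

# Stand-in for np.array(audio.get_array_of_samples()) on a stereo segment.
flat = np.array([1, -1, 2, -2, 3, -3], dtype=np.int16)

frames = deinterleave(flat, channels=2)
left, right = frames[:, 0], frames[:, 1]

# Flattening again restores the byte layout AudioSegment expects.
restored = interleave(frames)
```

After processing, restored.tobytes() is what would be passed back to the AudioSegment constructor.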

• Read a WAV file directly into a NumPy array.
from scipy.io.wavfile import read
rate, data = read("sound.wav")

• Write a NumPy array to a WAV file.
from scipy.io.wavfile import write
write("new_sound.wav", rate, data)

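• Generate a sine wave.
import numpy as np
sample_rate = 44100
frequency = 440  # A4 note
duration = 5  # seconds
# endpoint=False keeps the sample spacing at exactly 1 / sample_rate
t = np.linspace(0., duration, int(sample_rate * duration), endpoint=False)
amplitude = np.iinfo(np.int16).max * 0.5
# Cast to int16 so the array is stored as 16-bit PCM
data = (amplitude * np.sin(2. * np.pi * frequency * t)).astype(np.int16)
# This array can now be written to a file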
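
SciPy writes a WAV file using the dtype of the array it is given, so a float array scaled to the int16 range will not produce standard 16-bit PCM. A minimal round-trip sketch, assuming a one-second tone and a temporary file path chosen only for this example:

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

sample_rate = 44100
frequency = 440.0  # A4
duration = 1.0     # seconds, shortened for the example

# endpoint=False keeps the sample spacing at exactly 1 / sample_rate.
t = np.linspace(0.0, duration, int(sample_rate * duration), endpoint=False)
amplitude = np.iinfo(np.int16).max * 0.5
tone = (amplitude * np.sin(2.0 * np.pi * frequency * t)).astype(np.int16)

# Write to a temporary file, then read it back to confirm the format.
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
wavfile.write(path, sample_rate, tone)
rate_back, tone_back = wavfile.read(path)
```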


VIII. Audio Analysis with Librosa

• Load audio with Librosa.
import librosa
y, sr = librosa.load("sound.mp3") # Resamples to 22050 Hz mono by default

• Estimate tempo (Beats Per Minute).
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)

• Get beat event times in seconds.
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
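
The frame-to-time conversion is plain arithmetic: each frame index advances by one hop, so time = frame * hop_length / sr. A NumPy-only sketch, assuming Librosa's default hop_length of 512 and default sample rate of 22050 Hz (the function name frames_to_seconds is illustrative):

```python
import numpy as np

def frames_to_seconds(frames, sr=22050, hop_length=512):
    """Convert analysis frame indices to times in seconds."""
    return np.asarray(frames) * hop_length / sr

# Hypothetical beat positions expressed as frame indices.
example_frames = np.array([0, 43, 86])
example_times = frames_to_seconds(example_frames)
```

With these defaults, 43 frames is a little under one second.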

• Decompose into harmonic and percussive components.
y_harmonic, y_percussive = librosa.effects.hpss(y)

• Compute a spectrogram.
import numpy as np
D = librosa.stft(y)
S_db = librosa.amplitude_to_db(np.abs(D), ref=np.max)
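
To see what the decibel conversion does, here is a rough sketch using SciPy's STFT instead of Librosa. It assumes a synthetic 440 Hz tone, an amplitude floor of 1e-5, and an 80 dB dynamic range. SciPy scales and pads its STFT differently from Librosa, so absolute values will not match librosa.stft output.

```python
import numpy as np
from scipy import signal

sr = 22050
t = np.arange(sr) / sr                          # one second of samples
y_demo = 0.5 * np.sin(2 * np.pi * 440.0 * t)    # 440 Hz test tone

# Short-time Fourier transform: rows are frequency bins, columns are frames.
freqs, times, Z = signal.stft(y_demo, fs=sr, nperseg=2048, noverlap=2048 - 512)
magnitude = np.abs(Z)

# Decibels relative to the peak, with a floor to avoid log(0).
amin = 1e-5
ref = magnitude.max()
S_db_demo = 20.0 * np.log10(np.maximum(magnitude, amin)) - 20.0 * np.log10(max(ref, amin))

# Keep an 80 dB range below the peak.
S_db_demo = np.maximum(S_db_demo, S_db_demo.max() - 80.0)

# The strongest bin should sit next to the tone's frequency.
peak_freq = freqs[magnitude.mean(axis=1).argmax()]
```

Because the reference is the peak magnitude, the loudest bin lands at 0 dB and everything else is negative.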

• Compute Mel-Frequency Cepstral Coefficients (MFCCs).
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

• Compute Chroma features (related to musical pitch).
chroma = librosa.feature.chroma_stft(y=y, sr=sr)

• Detect onset events (the start of notes).
onset_frames = librosa.onset.onset_detect(y=y, sr=sr)
onset_times = librosa.frames_to_time(onset_frames, sr=sr)

• Pitch shifting.
y_pitched = librosa.effects.pitch_shift(y, sr=sr, n_steps=4) # Shift up 4 semitones

• Time stretching (change speed without changing pitch).
y_fast = librosa.effects.time_stretch(y, rate=2.0) # Double speed
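
The two parameters map to simple ratios: a shift of n semitones multiplies every frequency by 2^(n/12), and stretching by a given rate divides the duration by that rate. A small arithmetic sketch (the helper names are illustrative, not Librosa functions):

```python
def semitones_to_ratio(n_steps: float) -> float:
    """Frequency ratio for a shift of n_steps equal-tempered semitones."""
    return 2.0 ** (n_steps / 12.0)

def stretched_duration(duration_s: float, rate: float) -> float:
    """Duration after time stretching by `rate` (rate > 1 speeds up)."""
    return duration_s / rate

ratio_up_4 = semitones_to_ratio(4)           # four semitones up
a4_shifted = 440.0 * ratio_up_4              # A4 moved up four semitones
ten_s_at_2x = stretched_duration(10.0, 2.0)  # 10 s clip at double speed
```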


IX. More Utilities

• Detect leading silence.
from pydub.silence import detect_leading_silence
trim_ms = detect_leading_silence(audio)
trimmed_audio = audio[trim_ms:]
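
The idea behind leading-silence detection can be sketched in NumPy: walk through the audio in short chunks and stop at the first chunk whose level reaches a threshold. This is a simplified illustration, not Pydub's implementation. The -50 dBFS threshold, 10 ms chunk size, and 16-bit full scale are assumptions made for the example.

```python
import numpy as np

def leading_silence_ms(samples, sr, threshold_db=-50.0, chunk_ms=10, full_scale=32768.0):
    """Return how many milliseconds at the start stay below threshold_db (dBFS)."""
    chunk = max(1, int(sr * chunk_ms / 1000))
    trim_ms = 0
    for start in range(0, len(samples), chunk):
        block = samples[start:start + chunk].astype(np.float64)
        rms = np.sqrt(np.mean(block ** 2))
        # A chunk of pure zeros has no finite level in dB.
        level_db = 20.0 * np.log10(rms / full_scale) if rms > 0 else -np.inf
        if level_db >= threshold_db:
            break
        trim_ms += chunk_ms
    return trim_ms

# Half a second of digital silence followed by half a second of a 440 Hz tone.
sr = 8000
silence = np.zeros(sr // 2, dtype=np.int16)
t = np.arange(sr // 2) / sr
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440.0 * t)).astype(np.int16)
demo = np.concatenate([silence, tone])

trim = leading_silence_ms(demo, sr)
trimmed = demo[int(trim * sr / 1000):]
```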

• Get the root mean square (RMS) energy.
rms = audio.rms

• Get the maximum possible amplitude for the audio format.
max_possible_amplitude = audio.max_possible_amplitude
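
The rms and max_possible_amplitude values combine into a level in dBFS: 20 * log10(rms / max_possible_amplitude). A NumPy sketch of that relationship, assuming 16-bit samples (the function names are illustrative, and Pydub's own values may differ slightly because of integer rounding):

```python
import numpy as np

def rms(samples: np.ndarray) -> float:
    """Root mean square of the sample values."""
    return float(np.sqrt(np.mean(samples.astype(np.float64) ** 2)))

def dbfs(samples: np.ndarray, sample_width: int = 2) -> float:
    """Level relative to full scale, in decibels."""
    max_possible_amplitude = 2 ** (8 * sample_width) / 2  # 32768 for 16-bit audio
    level = rms(samples)
    if level == 0:
        return float("-inf")
    return float(20.0 * np.log10(level / max_possible_amplitude))

# A full-scale sine wave: 441 Hz at 44100 Hz is exactly 100 samples per cycle.
n = np.arange(44100)
full_scale_sine = (32767 * np.sin(2 * np.pi * 441.0 * n / 44100)).astype(np.int16)
sine_dbfs = dbfs(full_scale_sine)
```

A full-scale sine sits about 3 dB below full scale, because its RMS is its peak divided by the square root of 2.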

• Strip silence and normalize what remains.
from pydub.effects import normalize
trimmed_and_normalized = normalize(audio.strip_silence(silence_len=1000, silence_thresh=-32))

• Change the frame rate (resample).
resampled = audio.set_frame_rate(16000)

• Create a simple band-pass filter.
from pydub.scipy_effects import band_pass_filter
filtered = band_pass_filter(audio, 400, 2000) # Pass between 400Hz and 2000Hz
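
For arrays that are already in NumPy, a comparable band-pass can be sketched directly with SciPy. This example assumes a 5th-order Butterworth design and two synthetic test tones. It is an illustration of the technique, not necessarily identical to what Pydub does internally.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def band_pass(samples, sr, low_hz, high_hz, order=5):
    """Butterworth band-pass filter applied as second-order sections."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=sr, output="sos")
    return sosfilt(sos, samples)

def rms(x):
    return np.sqrt(np.mean(x ** 2))

sr = 44100
t = np.arange(sr) / sr
in_band = np.sin(2 * np.pi * 1000 * t)   # inside the 400-2000 Hz pass band
out_band = np.sin(2 * np.pi * 50 * t)    # well below the pass band

# Ratio of output level to input level for each tone.
kept = rms(band_pass(in_band, sr, 400, 2000)) / rms(in_band)
rejected = rms(band_pass(out_band, sr, 400, 2000)) / rms(out_band)
```

The 1000 Hz tone passes almost unchanged, while the 50 Hz tone is strongly attenuated.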

• Convert file format in one line.
AudioSegment.from_file("music.ogg").export("music.mp3", format="mp3")

• Get the raw bytes of the audio data.
raw_data = audio.raw_data

• Get the maximum amplitude.
max_amp = audio.max

• Match the volume of two segments.
matched_audio2 = audio2.apply_gain(audio1.dBFS - audio2.dBFS)
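
Gain matching works because a gain of g dB scales amplitudes by 10^(g/20), so applying the difference in levels brings the second signal to the level of the first. A NumPy sketch on float signals, with full scale taken as 1.0 (the helper names and test tones are assumptions for the example):

```python
import numpy as np

def level_db(x: np.ndarray) -> float:
    """RMS level in decibels relative to an amplitude of 1.0."""
    return float(20.0 * np.log10(np.sqrt(np.mean(x ** 2))))

def apply_gain_db(x: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale a signal by a gain given in decibels."""
    return x * 10.0 ** (gain_db / 20.0)

t = np.arange(8000) / 8000
loud = 0.5 * np.sin(2 * np.pi * 200 * t)
quiet = 0.05 * np.sin(2 * np.pi * 300 * t)

# Raise the quiet signal by the level difference, as in the snippet above.
gain = level_db(loud) - level_db(quiet)
matched = apply_gain_db(quiet, gain)
```

Here the quiet tone is ten times smaller in amplitude, so the computed gain is 20 dB.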


#Python #AudioProcessing #Pydub #Librosa #SignalProcessing

━━━━━━━━━━━━━━━
By: @DataScienceM