Closed
Description
Overview
I have several VideoFrames and AudioFrames that I want to save together in a single .mp4 file. Right now I can save the VideoFrames to one file and the AudioFrames to a separate file, but I haven't found a way to store them in one .mp4 file that contains both audio and video.
Expected behavior
I want to store an audio stream and a video stream to the same .mp4 file.
Actual behavior
No error is thrown, but the resulting .mp4 file cannot be opened and appears to be corrupted.
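To narrow down what is wrong with the file, it may help to list its top-level boxes. By default, FFmpeg's MP4 muxer writes the `moov` box when the file is finalized, so a file that players refuse to open is sometimes simply missing it. The following is a minimal sketch using only the standard library. `list_top_level_boxes` is a hypothetical helper name, not part of PyAV, and it assumes the usual ISO base media layout (a 4-byte big-endian size followed by a 4-byte type).

```python
import struct
from typing import BinaryIO, List, Tuple


def list_top_level_boxes(stream: BinaryIO) -> List[Tuple[str, int]]:
    """Return (type, size) for each top-level box in an ISO base media stream.

    Hypothetical diagnostic helper; it only walks box headers and does not
    validate box contents.
    """
    boxes = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of data (or a truncated header)
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            largesize = stream.read(8)
            if len(largesize) < 8:
                break
            size = struct.unpack(">Q", largesize)[0]
            header_len = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the stream.
            current = stream.tell()
            stream.seek(0, 2)
            size = stream.tell() - current + header_len
            stream.seek(current)
        boxes.append((box_type.decode("latin-1"), size))
        if size < header_len:
            break  # malformed size; stop rather than loop forever
        stream.seek(size - header_len, 1)  # skip the box payload
    return boxes
```

Opening `save_to.mp4` in binary mode and passing the file object to this function would show whether a `moov` entry is present alongside `ftyp` and `mdat`.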
Investigation
This results in a flawed .mp4 file:
import av
import numpy as np

# from_ndarray expects uint8 data for 'rgb24' and float32 data for 'fltp'
video_tensor = np.zeros((250, 176, 320, 3), dtype=np.uint8)
audio_tensor = np.zeros((930, 2, 1024), dtype=np.float32)

with av.open('save_to.mp4', mode='w', format='mp4') as container:
    stream_audio = container.add_stream('mp3', rate=48000)  # maybe the 'mp3' needs to be changed?
    stream_video = container.add_stream('h264', rate=24)
    stream_video.height = video_tensor.shape[-3]
    stream_video.width = video_tensor.shape[-2]

    # video encoding
    for vid in video_tensor:
        frame = av.VideoFrame.from_ndarray(vid, format='rgb24')
        for packet in stream_video.encode(frame):
            container.mux(packet)
    # flush the video encoder
    for packet in stream_video.encode(None):
        container.mux(packet)

    # audio encoding
    for i, audio in enumerate(audio_tensor):
        frame = av.AudioFrame.from_ndarray(array=audio, format='fltp', layout='stereo')
        frame.rate = 48000
        frame.pts = 1024 * i  # pts counted in samples
        for packet in stream_audio.encode(frame):
            container.mux(packet)
    # flush the audio encoder
    for packet in stream_audio.encode(None):
        container.mux(packet)
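One detail of the example above that may be worth checking is that the two arrays do not describe the same length of media. The sketch below works out both durations from the array shapes, the 24 fps video rate, and the 48000 Hz sample rate used in the code. `stream_durations` is a hypothetical helper introduced only for this illustration, and whether the mismatch contributes to the unreadable file is not established here.

```python
from fractions import Fraction


def stream_durations(video_shape, audio_shape, fps=24, sample_rate=48000):
    """Return (video_seconds, audio_seconds) as exact fractions.

    video_shape is (frames, height, width, channels);
    audio_shape is (frames, channels, samples_per_frame).
    """
    video_seconds = Fraction(video_shape[0], fps)
    audio_seconds = Fraction(audio_shape[0] * audio_shape[2], sample_rate)
    return video_seconds, audio_seconds


video_s, audio_s = stream_durations((250, 176, 320, 3), (930, 2, 1024))
print(float(video_s))  # about 10.42 seconds of video
print(float(audio_s))  # 19.84 seconds of audio
```

If both streams are meant to cover the same time span, roughly 488 audio frames of 1024 samples would match 250 video frames at 24 fps.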
Storing the audio frames and video frames separately works fine.
For VideoFrames to .mp4 (without audio):
import av
import numpy as np

# from_ndarray expects uint8 data for 'rgb24'
video_tensor = np.zeros((250, 176, 320, 3), dtype=np.uint8)
audio_tensor = np.zeros((930, 2, 1024), dtype=np.float32)  # unused in this example

with av.open('save_to.mp4', mode='w', format='mp4') as container:
    stream_video = container.add_stream('h264', rate=24)
    stream_video.height = video_tensor.shape[-3]
    stream_video.width = video_tensor.shape[-2]

    # video encoding
    for vid in video_tensor:
        frame = av.VideoFrame.from_ndarray(vid, format='rgb24')
        for packet in stream_video.encode(frame):
            container.mux(packet)
    # flush the video encoder
    for packet in stream_video.encode(None):
        container.mux(packet)

For AudioFrames to .mp3:
import av
import numpy as np

# from_ndarray expects float32 data for 'fltp'
video_tensor = np.zeros((250, 176, 320, 3), dtype=np.uint8)  # unused in this example
audio_tensor = np.zeros((930, 2, 1024), dtype=np.float32)

with av.open('save_to.mp3', mode='w', format='mp3') as container:
    stream_audio = container.add_stream('mp3', rate=48000)

    # audio encoding
    for i, audio in enumerate(audio_tensor):
        frame = av.AudioFrame.from_ndarray(array=audio, format='fltp', layout='stereo')
        frame.rate = 48000
        frame.pts = 1024 * i  # pts counted in samples
        for packet in stream_audio.encode(frame):
            container.mux(packet)
    # flush the audio encoder
    for packet in stream_audio.encode(None):
        container.mux(packet)
Research
I have done the following:
- Checked the PyAV documentation
- Searched on Google
- Searched on Stack Overflow
- Looked through old GitHub issues
- Asked on PyAV Gitter
- ... and waited 72 hours for a response.