PCM Wave file - stereo to mono
I have an audio file that is stereo. Is converting it to mono simply a matter of skipping every other byte (after the header)? It is encoded as 16-bit signed PCM. I have javax.sound.sampled available.
Here is the code I tried, which did not work:
WaveFileWriter wfw = new WaveFileWriter();
AudioFormat format = new AudioFormat(Encoding.PCM_SIGNED, 44100, 16, 2, 2, 44100, false);
AudioFormat monoFormat = new AudioFormat(Encoding.PCM_SIGNED, 44100, 16, 1, 2, 44100, false);
byte[] audioData = dataout.toByteArray();
int length = audioData.length;
ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
AudioInputStream stereoStream = new AudioInputStream(bais,format,length);
AudioInputStream monoStream = new AudioInputStream(stereoStream,format,length/2);
wfw.write(monoStream, Type.WAVE, new File(Environment.
getExternalStorageDirectory().getAbsolutePath()+"/stegDroid/un-ogged.wav"));
This code runs after reading an .ogg file with Jorbis to convert it to PCM data. The only problem is that the result is stereo and I need it to be mono, so if there is another solution, I would be glad to hear it!
3 Answers
I have an audio file that is stereo. Is converting it to mono simply a matter of skipping every other byte (after the header)?
Almost: you want to skip every other sample, not every other byte. In your case it looks like each sample is 16 bits = 2 bytes, so you would take 2 bytes, skip 2 bytes, take 2 bytes, and so on.
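The take-2-skip-2 pattern can be sketched as follows. This is only an illustration under stated assumptions: the input is raw interleaved 16-bit stereo PCM with the WAV header already removed, and the class and method names (StereoToMono, keepLeftChannel) are hypothetical, not part of javax.sound.sampled.

```java
final class StereoToMono {
    /**
     * Keeps the first (left) channel of interleaved 16-bit stereo PCM.
     * Assumes raw sample data with no WAV header, laid out frame by frame
     * as [left 2 bytes][right 2 bytes].
     */
    static byte[] keepLeftChannel(byte[] stereo) {
        int frames = stereo.length / 4;      // 4 bytes per stereo frame
        byte[] mono = new byte[frames * 2];  // 2 bytes per mono frame
        for (int f = 0; f < frames; f++) {
            mono[2 * f]     = stereo[4 * f];      // first byte of the left sample
            mono[2 * f + 1] = stereo[4 * f + 1];  // second byte of the left sample
            // bytes 4*f+2 and 4*f+3 (the right sample) are skipped
        }
        return mono;
    }
}
```

Dropping one channel discards whatever was only in the other channel; averaging the two samples is the usual alternative when that matters.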
AudioInputStream monoStream = new AudioInputStream(stereoStream,format,length/2);
wfw.write(monoStream, Type.WAVE, new File(Environment.getExternalStorageDirectory().getAbsolutePath()+"/stegDroid/un-ogged.wav"));
It looks like you are just writing the first half of the file rather than every other sample. You also need to fix the WAV header so that it indicates one channel (see your monoFormat).
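A minimal sketch of writing the mono data with a header that declares one channel. It assumes the bytes have already been reduced to mono 16-bit little-endian PCM, and that the standard javax.sound.sampled.AudioSystem is available (this may not hold on Android). The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

final class MonoWavWriter {
    /** Writes already-mono 16-bit little-endian signed PCM bytes as a WAV file. */
    static void writeMonoWav(byte[] monoPcm, float sampleRate, File out) throws IOException {
        int frameSize = 2; // 1 channel * 2 bytes per sample
        AudioFormat monoFormat = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED, sampleRate, 16, 1, frameSize, sampleRate, false);
        // The third constructor argument is a length in frames, not in bytes.
        long frames = monoPcm.length / frameSize;
        try (AudioInputStream stream = new AudioInputStream(
                new ByteArrayInputStream(monoPcm), monoFormat, frames)) {
            // The writer derives the header (including the channel count) from monoFormat.
            AudioSystem.write(stream, AudioFileFormat.Type.WAVE, out);
        }
    }
}
```

Passing the mono format to the stream, rather than the stereo one, is what makes the header and the data agree.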
Take a look at this code. It helped me when I needed to work with the bytes in a WAV file.
package GlobalUtilities;
import java.applet.Applet;
import java.applet.AudioClip;
import java.net.URISyntaxException;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.io.*;
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import javax.sound.sampled.*;
/**
* This class handles the reading, writing, and playing of wav files. It is
* also capable of converting the file to its raw byte [] form.
*
* based on code by Evan Merz modified by Dan Vargo
* @author dvargo
*/
public class Wav {
/*
WAV File Specification
FROM http://ccrma.stanford.edu/courses/422/projects/WaveFormat/
The canonical WAVE format starts with the RIFF header:
0 4 ChunkID Contains the letters "RIFF" in ASCII form
(0x52494646 big-endian form).
4 4 ChunkSize 36 + SubChunk2Size, or more precisely:
4 + (8 + SubChunk1Size) + (8 + SubChunk2Size)
This is the size of the rest of the chunk
following this number. This is the size of the
entire file in bytes minus 8 bytes for the
two fields not included in this count:
ChunkID and ChunkSize.
8 4 Format Contains the letters "WAVE"
(0x57415645 big-endian form).
The "WAVE" format consists of two subchunks: "fmt " and "data":
The "fmt " subchunk describes the sound data's format:
12 4 Subchunk1ID Contains the letters "fmt "
(0x666d7420 big-endian form).
16 4 Subchunk1Size 16 for PCM. This is the size of the
rest of the Subchunk which follows this number.
20 2 AudioFormat PCM = 1 (i.e. Linear quantization)
Values other than 1 indicate some
form of compression.
22 2 NumChannels Mono = 1, Stereo = 2, etc.
24 4 SampleRate 8000, 44100, etc.
28 4 ByteRate == SampleRate * NumChannels * BitsPerSample/8
32 2 BlockAlign == NumChannels * BitsPerSample/8
The number of bytes for one sample including
all channels. I wonder what happens when
this number isn't an integer?
34 2 BitsPerSample 8 bits = 8, 16 bits = 16, etc.
The "data" subchunk contains the size of the data and the actual sound:
36 4 Subchunk2ID Contains the letters "data"
(0x64617461 big-endian form).
40 4 Subchunk2Size == NumSamples * NumChannels * BitsPerSample/8
This is the number of bytes in the data.
You can also think of this as the size
of the rest of the subchunk following this
number.
44 * Data The actual sound data.
The thing that makes reading wav files tricky is that java has no unsigned types. This means that the
binary data can't just be read and cast appropriately. Also, we have to use larger types
than are normally necessary.
In many languages including java, an integer is represented by 4 bytes. The issue here is
that in most languages, integers can be signed or unsigned, and in wav files the integers
are unsigned. So, to make sure that we can store the proper values, we have to use longs
to hold integers, and integers to hold shorts.
Then, we have to convert back when we want to save our wav data.
It's complicated, but ultimately, it just results in a few extra functions at the bottom of
this file. Once you understand the issue, there is no reason to pay any more attention
to it.
ALSO:
This code won't read ALL wav files. This does not use the full specification. It just uses
a trimmed down version that most wav files adhere to.
*/
ByteArrayOutputStream byteArrayOutputStream;
AudioFormat audioFormat;
TargetDataLine targetDataLine;
AudioInputStream audioInputStream;
SourceDataLine sourceDataLine;
float frequency = 8000.0F; //8000,11025,16000,22050,44100
int samplesize = 16;
private String myPath;
private long myChunkSize;
private long mySubChunk1Size;
private int myFormat;
private long myChannels;
private long mySampleRate;
private long myByteRate;
private int myBlockAlign;
private int myBitsPerSample;
private long myDataSize;
// I made this public so that you can toss whatever you want in here
// maybe a recorded buffer, maybe just whatever you want
public byte[] myData;
public Wav()
{
myPath = "";
}
// constructor takes a wav path
public Wav(String tmpPath) {
myPath = tmpPath;
}
// get/set for the Path property
public String getPath()
{
return myPath;
}
public void setPath(String newPath)
{
myPath = newPath;
}
// read a wav file into this class
public boolean read() {
DataInputStream inFile = null;
myData = null;
byte[] tmpLong = new byte[4];
byte[] tmpInt = new byte[2];
try {
inFile = new DataInputStream(new FileInputStream(myPath));
//System.out.println("Reading wav file...\n"); // for debugging only
String chunkID = "" + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte();
inFile.read(tmpLong); // read the ChunkSize
myChunkSize = byteArrayToLong(tmpLong);
String format = "" + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte();
// print what we've read so far
//System.out.println("chunkID:" + chunkID + " chunk1Size:" + myChunkSize + " format:" + format); // for debugging only
String subChunk1ID = "" + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte();
inFile.read(tmpLong); // read the SubChunk1Size
mySubChunk1Size = byteArrayToLong(tmpLong);
inFile.read(tmpInt); // read the audio format. This should be 1 for PCM
myFormat = byteArrayToInt(tmpInt);
inFile.read(tmpInt); // read the # of channels (1 or 2)
myChannels = byteArrayToInt(tmpInt);
inFile.read(tmpLong); // read the samplerate
mySampleRate = byteArrayToLong(tmpLong);
inFile.read(tmpLong); // read the byterate
myByteRate = byteArrayToLong(tmpLong);
inFile.read(tmpInt); // read the blockalign
myBlockAlign = byteArrayToInt(tmpInt);
inFile.read(tmpInt); // read the bitspersample
myBitsPerSample = byteArrayToInt(tmpInt);
// print what we've read so far
//System.out.println("SubChunk1ID:" + subChunk1ID + " SubChunk1Size:" + mySubChunk1Size + " AudioFormat:" + myFormat + " Channels:" + myChannels + " SampleRate:" + mySampleRate);
// read the data chunk header - reading this IS necessary, because not all wav files will have the data chunk here - for now, we're just assuming that the data chunk is here
String dataChunkID = "" + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte() + (char) inFile.readByte();
inFile.read(tmpLong); // read the size of the data
myDataSize = byteArrayToLong(tmpLong);
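// read the data chunk; readFully blocks until the whole array has been filled,
// whereas a plain read() may return after reading only part of it
myData = new byte[(int) myDataSize];
inFile.readFully(myData);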
// close the input stream
inFile.close();
} catch (Exception e) {
return false;
}
return true; // this should probably be something more descriptive
}
// write out the wav file
public boolean save() {
// try-with-resources closes (and flushes) the stream even if a write fails
try (DataOutputStream outFile = new DataOutputStream(new FileOutputStream(myPath + "temp"))) {
// write the wav file per the wav file format
outFile.writeBytes("RIFF"); // 00 - RIFF
outFile.write(intToByteArray((int) myChunkSize), 0, 4); // 04 - how big is the rest of this file?
outFile.writeBytes("WAVE"); // 08 - WAVE
outFile.writeBytes("fmt "); // 12 - fmt
outFile.write(intToByteArray((int) mySubChunk1Size), 0, 4); // 16 - size of this chunk
outFile.write(shortToByteArray((short) myFormat), 0, 2); // 20 - what is the audio format? 1 for PCM = Pulse Code Modulation
outFile.write(shortToByteArray((short) myChannels), 0, 2); // 22 - mono or stereo? 1 or 2? (or 5 or ???)
outFile.write(intToByteArray((int) mySampleRate), 0, 4); // 24 - samples per second (numbers per second)
outFile.write(intToByteArray((int) myByteRate), 0, 4); // 28 - bytes per second
outFile.write(shortToByteArray((short) myBlockAlign), 0, 2); // 32 - # of bytes in one sample, for all channels
outFile.write(shortToByteArray((short) myBitsPerSample), 0, 2); // 34 - how many bits in a sample(number)? usually 16 or 24
outFile.writeBytes("data"); // 36 - data
outFile.write(intToByteArray((int) myDataSize), 0, 4); // 40 - how big is this data chunk
outFile.write(myData); // 44 - the actual data itself - just a long string of numbers
} catch (Exception e) {
System.out.println(e.getMessage());
return false;
}
return true;
}
// return a printable summary of the wav file
public String getSummary() {
//String newline = System.getProperty("line.separator");
String newline = "\n";
String summary = "Format: " + myFormat + newline + "Channels: " + myChannels + newline + "SampleRate: " + mySampleRate + newline + "ByteRate: " + myByteRate + newline + "BlockAlign: " + myBlockAlign + newline + "BitsPerSample: " + myBitsPerSample + newline + "DataSize: " + myDataSize + "";
return summary;
}
public byte[] getBytes() {
read();
return myData;
}
/**
* Plays back audio stored in the byte array using an audio format given by
* freq, sample rate, etc.
* @param data The byte array to play
*/
public void playAudio(byte[] data) {
try {
byte audioData[] = data;
//Get an input stream on the byte array containing the data
InputStream byteArrayInputStream = new ByteArrayInputStream(audioData);
AudioFormat audioFormat = getAudioFormat();
audioInputStream = new AudioInputStream(byteArrayInputStream, audioFormat, audioData.length / audioFormat.getFrameSize());
DataLine.Info dataLineInfo = new DataLine.Info(SourceDataLine.class, audioFormat);
sourceDataLine = (SourceDataLine) AudioSystem.getLine(dataLineInfo);
sourceDataLine.open(audioFormat);
sourceDataLine.start();
//Create a thread to play back the data and start it running. It will run
//until all the data has been played back.
Thread playThread = new Thread(new PlayThread());
playThread.start();
} catch (Exception e) {
System.out.println(e);
}
}
/**
* This method creates and returns an AudioFormat object for a given set
* of format parameters. If these parameters don't work well for
* you, try some of the other allowable parameter values, which
* are shown in comments following the declarations.
* @return
*/
private AudioFormat getAudioFormat() {
float sampleRate = frequency;
//8000,11025,16000,22050,44100
int sampleSizeInBits = samplesize;
//8,16
int channels = 1;
//1,2
boolean signed = true;
//true,false
boolean bigEndian = false;
//true,false
//return new AudioFormat( AudioFormat.Encoding.PCM_SIGNED, 8000.0f, 8, 1, 1,
//8000.0f, false );
return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
}
public void playWav(String filePath) {
try {
AudioClip clip = (AudioClip) Applet.newAudioClip(new File(filePath).toURI().toURL());
clip.play();
} catch (Exception e) {
Logger.getLogger(Wav.class.getName()).log(Level.SEVERE, null, e);
}
}
// ===========================
// CONVERT BYTES TO JAVA TYPES
// ===========================
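// convert a 2-byte little-endian array to an unsigned short (held in an int)
public static int byteArrayToInt(byte[] b) {
int start = 0;
int low = b[start] & 0xff;
int high = b[start + 1] & 0xff;
return (int) (high << 8 | low);
}
// convert a 4-byte little-endian array to an unsigned integer (held in a long)
public static long byteArrayToLong(byte[] b) {
long accum = 0;
int i = 0;
for (int shiftBy = 0; shiftBy < 32; shiftBy += 8) {
accum |= ((long) (b[i] & 0xff)) << shiftBy;
i++;
}
return accum;
}
// ===========================
// CONVERT JAVA TYPES TO BYTES
// ===========================
// returns a little-endian byte array of length 4
private static byte[] intToByteArray(int i) {
byte[] b = new byte[4];
b[0] = (byte) (i & 0x000000FF);
b[1] = (byte) ((i >> 8) & 0x000000FF);
b[2] = (byte) ((i >> 16) & 0x000000FF);
b[3] = (byte) ((i >> 24) & 0x000000FF);
return b;
}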
// convert a short to a byte array
public static byte[] shortToByteArray(short data) {
return new byte[]{(byte) (data & 0xff), (byte) ((data >>> 8) & 0xff)};
}
/**
* Inner class to play back the data that was saved
*/
class PlayThread extends Thread {
byte tempBuffer[] = new byte[10000];
public void run() {
try {
int cnt;
//Keep looping until the input
// read method returns -1 for
// empty stream.
while ((cnt = audioInputStream.read(tempBuffer, 0, tempBuffer.length)) != -1) {
if (cnt > 0) {
//Write data to the internal
// buffer of the data line
// where it will be delivered
// to the speaker.
sourceDataLine.write(tempBuffer, 0, cnt);
}
}
//Block and wait for internal
// buffer of the data line to
// empty.
sourceDataLine.drain();
sourceDataLine.close();
} catch (Exception e) {
System.out.println(e);
System.exit(0);
}
}
}
}
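Once the raw data bytes are in hand (for example from the myData field of the class above), a stereo-to-mono down-mix could be sketched like this. It assumes interleaved 16-bit little-endian signed PCM, and it averages the two channels instead of dropping one. The ChannelAverager class is hypothetical and is not part of the Wav class.

```java
final class ChannelAverager {
    /** Mixes interleaved 16-bit little-endian signed stereo PCM down to mono by averaging L and R. */
    static byte[] averageToMono(byte[] stereo) {
        int frames = stereo.length / 4;      // 4 bytes per stereo frame
        byte[] mono = new byte[frames * 2];  // 2 bytes per mono frame
        for (int f = 0; f < frames; f++) {
            // Decode each little-endian sample; the cast to short restores the sign.
            int left  = (short) ((stereo[4 * f]     & 0xff) | (stereo[4 * f + 1] << 8));
            int right = (short) ((stereo[4 * f + 2] & 0xff) | (stereo[4 * f + 3] << 8));
            int mixed = (left + right) / 2;  // always fits in 16 bits
            // Encode the result back to little-endian.
            mono[2 * f]     = (byte) (mixed & 0xff);
            mono[2 * f + 1] = (byte) ((mixed >> 8) & 0xff);
        }
        return mono;
    }
}
```

After a conversion like this, the header fields would need to change as well: one channel, and the byte rate, block align, and data size each halved.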
Answering this in 2018. I had a similar situation and noticed an obvious mistake you made: the parameters in your "format" constructor call are wrong.
AudioFormat format = new AudioFormat(Encoding.PCM_SIGNED, 44100, 16, 2, 2, 44100,
false);
The fifth parameter (in your case the second "2") is the frame size. Frame size = sample size * channels. Since your bit depth is 16, your sample size is 2 bytes.
Sample size = 2
Channels = 2
Frame size = sample size * channels = 4
So your line of code should read
AudioFormat format = new AudioFormat(Encoding.PCM_SIGNED, 44100, 16, 2, 4, 44100,
false);
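The same arithmetic can be done in code so that the frame size always follows from the bit depth and channel count. This is a sketch only; the helper name pcmFormat is hypothetical, and it assumes signed little-endian PCM, for which the frame rate equals the sample rate.

```java
import javax.sound.sampled.AudioFormat;

final class FrameSizeExample {
    /** Builds a signed little-endian PCM format with a derived frame size. */
    static AudioFormat pcmFormat(float sampleRate, int bitsPerSample, int channels) {
        int sampleSizeBytes = (bitsPerSample + 7) / 8;  // 16 bits -> 2 bytes
        int frameSize = sampleSizeBytes * channels;     // 2 bytes * 2 channels -> 4
        return new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                sampleRate, bitsPerSample, channels, frameSize, sampleRate, false);
    }
}
```

With this helper, pcmFormat(44100f, 16, 2) has a frame size of 4 and pcmFormat(44100f, 16, 1) has a frame size of 2.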
Also, have you tried using FormatConversionProvider?
javax.sound.sampled.spi.FormatConversionProvider
https://docs.oracle.com/javase/tutorial/sound/converters.html This tutorial helped me a lot, but I believe it assumes you have already imported the class mentioned above.
I did not see these solutions posted in this thread, but perhaps you have already figured it out. In any case, I hope this helps!
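In application code, installed conversion providers are normally reached through AudioSystem rather than called directly. A hedged sketch of asking for a mono version of a stream is shown below. Whether a channel-count conversion is supported depends on the providers installed in the runtime, so the sketch checks first and returns null if it is not. The MonoConverter class is hypothetical.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

final class MonoConverter {
    /** Returns a mono version of the stream, or null if no installed provider supports it. */
    static AudioInputStream toMono(AudioInputStream source) {
        AudioFormat src = source.getFormat();
        int sampleBytes = src.getSampleSizeInBits() / 8;
        // Same encoding, rate, and endianness; one channel, so frame size = sample size.
        AudioFormat mono = new AudioFormat(src.getEncoding(), src.getSampleRate(),
                src.getSampleSizeInBits(), 1, sampleBytes, src.getFrameRate(), src.isBigEndian());
        if (!AudioSystem.isConversionSupported(mono, src)) {
            return null; // caller would fall back to converting the bytes by hand
        }
        return AudioSystem.getAudioInputStream(mono, source);
    }
}
```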