Feb 27, 2021
by Sound & Vision
Info: By: https://synthmania.com/gigapack.htm https://www.youtube.com/watch?v=mnFnQ507T9k GIGAPACK *Sample CD-ROM library in Roland S-770 and compatible format This is an excellent third-party library. Gigapack comes in 2 CD-ROMs, and they complement each other very well. CD1: The first CD-ROM consists entirely of drums and loops - there are literally hundreds to choose from! Many samples of classic kits and drum machines are also provided, so it's possible to make your own...
Topics: Roland, Sound&vision, gigapack, samplecd, CDrom, 90s, vgm, sampler, Roland S-770, bgm, sfx,...
This report describes the design and development of a real-time adaptive transform coder that transmits high-quality speech over a 9600 bps channel with bit-error rates of up to 1% without significant loss of speech fidelity. The report presents the results of our FORTRAN simulations on the adaptive transform coder which maximized the quality of the transmitted speech. Important aspects of the ATC algorithm that were optimized include specification and transmission of the side-band information,...
Topics: DTIC Archive, Goldberg,A J, GTE PRODUCTS CORP NEEDHAM HEIGHTS MA COMMUNICATION SYSTEMS DIV,...
Section 1 discusses and demonstrates what parameter adjustments can accomplish in the method of removing motion blur from photographic images. Section 2 indicates the status of work in color vision, and the expected method of reporting these results. Section 3 describes a method for strikingly increasing the perceived quality of synthetic speech. Additional computation at the receiver is used to generate two channels (i.e. binaural) of sound for a stereo headphone set. This method requires no...
Topics: DTIC Archive, Stockham, Jr, Thomas G, UTAH UNIV SALT LAKE CITY SCHOOL OF COMPUTING, *INFORMATION...
This report summarizes the activities of the Decision Sciences Laboratory and describes achievements, progress, and results obtained by the Laboratory scientists in the past two years.
Topics: DTIC Archive, ELECTRONIC SYSTEMS DIV L G HANSCOM FIELD MA DECISION SCIENCES LAB, *HUMAN FACTORS...
This document reports on work toward a very low rate vocoder. We model speech as a Markov Chain of spectral templates for the unsupervised learning approach to very low rate vocoding. This quarter investigated some variations in the spectral clustering algorithms. We also decided to use the speech of many speakers for the clustering and sequential modeling. We also began work on synthesizing the speech of many speakers from a diphone data base recorded from a single speaker. In phonetic...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *LOW RATE, *VOCODERS, *VOICE...
We report on research toward a very-low-rate vocoder. This quarter we continued investigation in three areas. The first area of research is multi-speaker synthesis: speech synthesis from the transmitted vocoder parameters with the voice quality of the vocoder user. This processing entails speaker-specific spectral transformation of the vocoder diphone database. The second area of research is to improve the accuracy of the phonetic recognition. Our work this quarter concentrated on training the...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH ARTICULATION,...
Contents: Packet Speech; and Acoustoelectric Convolvers.
Topics: DTIC Archive, Gold, Bernard, Stern, Ernest, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
Longbrake II is a 12 month program consisting of both study and hardware phases. The objective of the study is to develop an optimum Linear Predictive Coding vocoder system by optimizing the current real time algorithms used, and by the evaluation of additional LPC related techniques. The purpose of the hardware effort is to develop, test, and evaluate two Exploratory Development Models (EDM) of a speech compression system which incorporates the Atal and Markel approaches to adaptive linear...
Topics: DTIC Archive, Foran,Joseph W, PHILCO-FORD CORP WILLOW GROVE PA COMMUNICATION SYSTEMS DIV,...
This report describes the design and construction of prototype portable voice communication units and the implementation of the BBN robust 16 kbit/s adaptive predictive coding algorithm as a full-duplex real-time speech coder on these units. The report documents the hardware and software design and implementation efforts, and presents the results of a hardware production cost study. Work on algorithm simplification of the BBN 2.4 kbit/s harmonic deviations LPC speech coder is also described....
Topics: DTIC Archive, Tiao,J, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *ANALOG TO DIGITAL CONVERTERS,...
We report on research toward a very-low-rate vocoder. This quarter we continued investigation in three areas. The first area of research is multi-speaker synthesis: speech synthesis from the transmitted vocoder parameters with the voice quality of the vocoder user. This processing entails speaker-specific spectral transformation of the vocoder diphone database. The second area of research is to improve the accuracy of the phonetic recognition. Our work this quarter concentrated on training the...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH ARTICULATION,...
A microprocessor realization for a linear predictive vocoder is presented. The goal was a low power, low cost, compact special purpose realization of a narrow band speech terminal. The resultant design is a general purpose two bus structure running at a 150 ns cycle time using as the basic signal processing element four of the AMD 2901 CPE chips. This basic structure is augmented by a four cycle multiplier to allow for sufficient signal processing power. The design concessions that mark the...
Topics: DTIC Archive, Hofstetter, Edward M, Tierney, Joseph, Wheeler, Omar C, MASSACHUSETTS INST OF TECH...
The long-range objectives of the Packet Speech Systems Technology Program are to develop and demonstrate techniques for efficient digital speech communication on networks suitable for both voice and data, and to investigate and develop techniques for integrated voice and data communication in packetized networks, including wideband common-user satellite links. Specific areas of concern are: the concentration of statistically fluctuating volumes of voice traffic, the adaptation of communication...
Topics: DTIC Archive, Weinstein, Clifford J, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
A speech coding processor architecture design study has been performed in which Texas Instruments TMS32010 has been selected from among three commercially available digital signal processing integrated circuits and evaluated in an implementation study of real-time Adaptive Predictive Coding (APC). The TMS32010 has been compared with AT&T Bell Laboratories DSP I and Nippon Electric Co. µPD7720 and was found to be most suitable for a single chip implementation of APC. A preliminary design...
Topics: DTIC Archive, Randolph,M A, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH, *DIGITAL...
The report summarizes the results of a program of research concerned with the perception of degraded speech in normal and pathological listeners. The comparisons were derived from performance in a two-category judgment task where the anchor values remain accessible to the subject and under the control of the experimenter. Response decisions and frequency of anchor testings are recorded together with decision times in computer compatible tape format. In addition, studies were designed to examine...
Topics: DTIC Archive, Mostofsky, David I, EDUCATIONAL RESEARCH CORP CAMBRIDGE MA, *SPEECH RECOGNITION,...
This document reports on work toward a very low rate vocoder. We model speech as a Markov Chain of spectral templates for the unsupervised learning approach to very low rate vocoding. This quarter we compared several clustering techniques. We determined that a hierarchical approach to clustering is economical with minimal loss in performance. Furthermore, we found that a small number of spectral templates (from 128 to 256) is sufficient for vocoding with good intelligibility. Also, in the...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH TRANSMISSION,...
A programmatically controlled analog switch unit for multiplexing several participants using a single vocoder into a network voice conference is described. Two methods of using this unit to support variations in conference protocols are discussed. An investigation was made of a proposed variable frame rate transmission system for network voice communication. This LPC-II system was compared with the existing system and several other possible choices. The comparisons are based on listening tests...
Topics: DTIC Archive, Culler, Glen J, McCammon, Michael, CULLER/HARRISON INC GOLETA CA, *COMMUNICATIONS...
This document reports the development of a data base of vocoded speech that has been passed through several popular voice coders and their corresponding decoders. The KING data base consists of samples from 26 male speakers recorded sequentially on DAT tape. Two products were produced: five DAT tapes, each consisting of the original KING data base on one channel and the vocoded version on the other; and raw binary files, one for each vocoded and speaker combination.
Topics: DTIC Archive, Grossnickle, P. C., NAVAL COMMAND CONTROL AND OCEAN SURVEILLANCE CENTER RDT AND E DIV...
Topics: DTIC Archive, McAulay, R. J., MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *VOCODERS,...
This report presents a summary of work performed at Lincoln Laboratory aimed at improving the intelligibility of 2.4 kbps vocoders to be used in USAF operational environments. The distortions present in some of these environments, particularly the F-15 fighter aircraft, can place a severe burden on the speech modelling capabilities of contemporary vocoders. To study these effects and the benefits of various algorithmic improvements, the Diagnostic Rhyme Test was used as a means of providing an...
Topics: DTIC Archive, Singer,E, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *Vocoders, *Acoustic...
This document is devoted to an analysis of the intelligibility of semantically anomalous sentences presented in four acoustically different conditions: (1) natural speech, no noise; (2) vocoded speech, no noise; (3) vocoded speech, noise added to the pitch track; (4) vocoded speech, noise to the spectrum. One objective was to analyze the specific types of errors in each condition. The other objective was to compare results of this analysis with results obtained from the Diagnostic Rhyme Test...
Topics: DTIC Archive, Mack,M A, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH RECOGNITION,...
This thesis proposes a new analysis/synthesis procedure for speech and image compression. The algorithm applies the discrete wavelet transform to subject data in order to obtain a set of multiresolution wavelet coefficients. The wavelet coefficients are then encoded by using the generalized Lloyd algorithm. The statistical properties of the wavelet coefficients are utilized to determine the number of resolution levels as well as the codebook size at each resolution level. Coding results show...
Topics: DTIC Archive, Erdemir, Alper, NAVAL POSTGRADUATE SCHOOL MONTEREY CA, *CODING, *QUANTIZATION,...
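The pipeline named in the abstract above (a discrete wavelet transform followed by a codebook trained with the generalized Lloyd algorithm) can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet and a scalar codebook; the abstract does not say which wavelet, how many resolution levels, or what codebook sizes the thesis used, so the function names, the test signal, and the codebook size of 2 are hypothetical choices for illustration only.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert haar_step by recombining each approximation/detail pair."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def lloyd_codebook(values, size, iterations=20):
    """Scalar Lloyd iteration: nearest-codeword assignment, then centroid update."""
    lo, hi = min(values), max(values)
    # Assumption: start with codewords spread evenly over the data range.
    codebook = [lo + (hi - lo) * (k + 0.5) / size for k in range(size)]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for v in values:
            nearest = min(range(size), key=lambda k: abs(v - codebook[k]))
            cells[nearest].append(v)
        # Empty cells keep their previous codeword.
        codebook = [sum(c) / len(c) if c else codebook[k] for k, c in enumerate(cells)]
    return codebook

def quantize(values, codebook):
    """Replace each value with its nearest codeword."""
    return [min(codebook, key=lambda c: abs(v - c)) for v in values]

# Hypothetical test signal (even length, as the single-level transform requires).
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_step(signal)
codebook = lloyd_codebook(detail, size=2)
decoded = haar_inverse_step(approx, quantize(detail, codebook))
```

Only the detail coefficients are quantized here, which keeps the example short; a coder along the lines of the abstract would presumably train a separate codebook per resolution level.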
To help achieve universal secure communications interoperability in the Department of Defense (DoD), one intermediate goal has been the development of a universal voice encoder (vocoder) that can seamlessly encode speech at a wide range of variable and fixed data rates to suit a wide range of DoD communication equipment. This report describes the most recent advancements in achieving this goal. Four of the most important areas of improvements presented are: (1) Significant improvements were...
Topics: DTIC Archive, NAVAL RESEARCH LAB WASHINGTON DC, *DATA RATE, *VARIABLES, *VOCODERS, CODING, SECURE...
Nearly all types of military speech communication involve the use of so-called (narrow band) voice coders or vocoders. Usually the Speech Transmission Index (STI) uses artificial test signals, which cannot be reproduced by vocoders with the usual fidelity. Therefore the STI is not able to evaluate vocoders at this time. Although it is theoretically feasible to measure the Speech Transmission Index with natural speech instead of the usual artificial test signals, each of the various...
Topics: DTIC Archive, van Gils, Bastiaan J, TNO HUMAN FACTORS SOESTERBERG (NETHERLANDS) THERMAL PHYSIOLOGY...
For nearly 20 years, DoD has had only one narrowband voice algorithm called Linear Predictive Coder (LPC). It is used in the Advanced Narrowband Digital Voice Terminals (ANDVTs) operating at 2400 bits per second (b/s). Currently, 40,000 ANDVTs have been deployed by the Navy, Army, Air Force, Marine Corps, and special government agencies. DoD is currently planning to develop a new narrowband voice terminal called the Future Narrowband Digital Terminal (FNBDT), which features a new voice...
Topics: DTIC Archive, Kang, George S, NAVAL RESEARCH LAB WASHINGTON DC, *CODING, *DIGITAL COMMUNICATIONS,...
Much of the redundancy in a speech or television signal is eliminated when that signal is encoded into digital form by a differential PCM encoder. Further coding of the differential PCM output using entropy coding techniques (Huffman or Shannon-Fano coding) can result in a further increase in the signal to quantizing noise ratio of 5.6 dB without increasing the transmission rate. This conclusion is reached by comparing the performance of a DPCM system without entropy coding with one using...
Topics: DTIC Archive, O'Neal, Jr, J B, NORTH CAROLINA STATE UNIV AT RALEIGH DEPT OF ELECTRICAL ENGINEERING,...
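The idea in the abstract above, that DPCM output still leaves room for entropy coding, can be sketched under simple assumptions. The example uses a first-order predictor (the previous reconstructed sample), a uniform quantizer, and a synthetic sine input; none of these choices come from the report, and the 5.6 dB figure is not reproduced here. The empirical entropy of the quantizer indices gives a lower bound on the average bits per sample that a Huffman or Shannon-Fano code could approach.

```python
import math
from collections import Counter

def dpcm_encode(samples, step):
    """First-order DPCM: quantize the difference from the previous reconstruction."""
    indices = []
    prediction = 0.0
    for x in samples:
        index = round((x - prediction) / step)
        indices.append(index)
        prediction += index * step  # track what the decoder will reconstruct
    return indices

def dpcm_decode(indices, step):
    """Rebuild the signal by accumulating the dequantized differences."""
    out = []
    prediction = 0.0
    for index in indices:
        prediction += index * step
        out.append(prediction)
    return out

def entropy_bits(symbols):
    """Empirical zeroth-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Hypothetical slowly varying input and quantizer step size.
samples = [100.0 * math.sin(2 * math.pi * n / 64) for n in range(512)]
step = 2.0

indices = dpcm_encode(samples, step)
decoded = dpcm_decode(indices, step)

# For comparison: the same quantizer applied directly to the samples (plain PCM).
pcm_indices = [round(x / step) for x in samples]
```

Comparing `entropy_bits(indices)` with `entropy_bits(pcm_indices)` shows how much redundancy the differential stage removes for this input, and comparing it with the fixed-length cost `math.log2(len(set(indices)))` shows what entropy coding could still save.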
Coding under a fidelity criterion of a class of sources which emit randomly occurring messages is investigated. This class of sources models information carrying processes entering into communication networks. Messages emitted by computer terminals, teletypes, vocoders, and other such devices serve as actual examples. For this class of sources the rate distortion function is derived, and source coding and converse source coding theorems are proven. Employing these theorems, an operational...
Topics: DTIC Archive, Nussbaum,Howard S, CALIFORNIA UNIV LOS ANGELES SCHOOL OF ENGINEERING AND APPLIED...
Alternative switching strategies for future integrated DOD voice and data networks are studied. Three fundamental problems are addressed: (1) The economics of integrating voice and data applications in a common communication system; (2) the comparison of alternative switching technologies for integrated voice and data networks; (3) the economics of low voice digitization rate devices. Strategies examined are traditional, fast, and ideal circuit switching, hybrid (circuit-packet) switching, and...
Topics: DTIC Archive, Frank, Howard, NETWORK ANALYSIS CORP GREAT NECK NY, *COMMUNICATIONS NETWORKS, *DATA...
This report describes our work for the past year on speech compression and synthesis. We implemented a real-time variable-frame-rate LPC vocoder operating at an average rate of 2000 bits/s. We also tested our mixed-source model as part of the vocoder. To improve the reliability of the extraction of LPC parameters, we implemented and tested a range of adaptive lattice and autocorrelation algorithms. For data rates above 5000 bits/s, we developed and tested a new high-frequency regeneration...
Topics: DTIC Archive, Berouti, Michael, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH COMPRESSION,...
This document reports on work toward a very low rate vocoder. We model speech as a Markov Chain of spectral templates for the unsupervised learning approach to very low rate vocoding. This quarter we compared several clustering techniques. We determined that a hierarchical approach to clustering is economical with minimal loss in performance. Furthermore, we found that a small number of spectral templates (from 128 to 256) is sufficient for vocoding with good intelligibility. Also, in the...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH TRANSMISSION,...
This report describes a system which processes speech using linear predictive methods. The system is a software simulation of an LPC analyzer and synthesizer. The system consists of two programs, one of which processes the speech to generate the LPC parameters, and another which processes these parameters to resynthesize the speech. An important aspect of the system is that it enables the user to select from various pitch and coefficient analysis methods. It also allows the user to vary other...
Topics: DTIC Archive, McKown,C E, AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING,...
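The analysis half of an LPC system like the one described above can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is one standard coefficient analysis method, chosen here as an assumption; the abstract says the system offers several methods but does not name them. The frame contents, the model order of 4, and all function names are hypothetical.

```python
import math

def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, error), where the predictor is
    x_hat[n] = sum(a[k-1] * x[n-k] for k in 1..order)
    and error is the remaining prediction-error energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        reflection = acc / error
        new_a = a[:]
        new_a[m] = reflection
        for k in range(1, m):
            new_a[k] = a[k] - reflection * a[m - k]
        a = new_a
        error *= 1.0 - reflection * reflection
    return a[1:], error

# Hypothetical frame: two sinusoids plus a small periodic perturbation.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) + 0.1 * ((2 * n) % 13 - 6)
         for n in range(200)]
order = 4
r = autocorrelation(frame, order)
coeffs, residual_energy = levinson_durbin(r, order)
```

A synthesizer would drive the all-pole filter defined by `coeffs` with a pitch-pulse or noise excitation; that half is omitted to keep the sketch short.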
A detailed description of the implementation of a robust 2400 b/s LPC algorithm is presented. The algorithm was developed to improve vocoder performance in acoustically compromised environments. Improved robustness in noise is achieved by: (1) increasing the speech bandwidth to 5 kHz; (2) increasing the LPC model order to 12; and (3) doubling the analysis rate. Frame fill techniques are used to achieve the 2400 b/s data rate. The algorithm is embodied in the Advanced Linear Predictive Coding...
Topics: DTIC Archive, Singer, Elliot, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *BACKGROUND NOISE,...
The long-range objectives of the Packet Speech Systems Technology Program are to develop and demonstrate techniques for efficient digital speech communication on networks suitable for both voice and data, and to investigate and develop techniques for integrated voice and data communication in packetized networks, including wideband common-user satellite links. Specific areas of concern are: the concentration of statistically fluctuating volumes of voice traffic, the adaptation of communication...
Topics: DTIC Archive, Weinstein, Clifford J, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
This report describes work performed under three programs: Packet Speech, Acoustic Convolvers, and Airborne Command and Control sponsored by the Information Processing Techniques Office of the Defense Advanced Research Projects Agency during the semiannual period 1 January through 30 June 1976. The first two programs are reported in Vol. 1 and the third in Vol. 2.
Topics: DTIC Archive, Gold, Bernard, Stern, Ernest, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
We report on research toward a very-low-rate vocoder. This quarter we continued investigation in three areas. The first area of research is multi-speaker synthesis: speech synthesis from the transmitted vocoder parameters with the voice quality of the vocoder user. This processing entails speaker-specific spectral transformation of the vocoder diphone database. The second area of research is to improve the accuracy of the phonetic recognition. Our work this quarter concentrated on training the...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH ARTICULATION,...
A flexible, high-speed digital voice processor has been designed for use in conjunction with the JTIDS communication terminal to be installed for testing on board F-15 fighter aircraft and in Army JTIDS installations. The processor, known as the Advanced Linear Predictive Coding Microprocessor (ALPCM), has an architecture which is similar to that of its predecessor but contains significant improvements in speed, memory, and software development aids. The design includes an arithmetic section (four AMD...
Topics: DTIC Archive, Hofstetter,Edward M, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *Voice...
This appendix to Lincoln Laboratory Technical Note 1976-37 provides all of the detailed drawings, layouts and cabling information to construct an identical vocoder to the one described in the technical note. The additional comments may clear up any additional questions concerning this appendix.
Topics: DTIC Archive, Hofstetter, Edward M., MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
The goal of this program is the investigation and development of techniques for communications-adaptive internetting with particular emphasis on digital voice communications. The specific class of problems addressed relates to packet speech networks whose interconnecting links may be stressed as a result of traffic overloads or are time varying due to natural or hostile actions. This program extends the technology of fixed-topology packet-switching speech communications networks by introducing...
Topics: DTIC Archive, Gold, Bernard, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *COMMAND AND CONTROL...
Section 1 reviews the atmospheric turbulence deblurring problem. Based on this review, a research plan is being developed for research in the area. Section 2 reports on a mathematical technique for modeling the 'zeros' in synthetic speech. This work is continuing so these preliminary technical results are not yet conclusive. Section 3 abstracts the finished work on Short Time Spectrum Acoustic Processing. Section 4 abstracts work on two estimation parameters for audio signals. Section 5...
Topics: DTIC Archive, Stockham, Jr, Thomas G, UTAH UNIV SALT LAKE CITY SCHOOL OF COMPUTING, *IMAGE...
A non-real time 10 pole recursive autocorrelation linear predictive coding vocoder was created for use in studying effects of recursive autocorrelation on speech. The vocoder is composed of two interchangeable pitch detectors, a speech analyzer, and speech synthesizer. The time between updating filter coefficients is allowed to vary from .125 msec to 20 msec. The best quality was found using .125 msec between each update. The greatest change in quality was noted when changing from 20...
Topics: DTIC Archive, Janssen,W A, AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING,...
A novel technique for coping with fading and burst noise on high frequency channels used for digital voice communication has been developed and evaluated. The technique transmits digital voice only during the high signal-to-noise ratio time intervals, i.e., channel on times, and speeds up the speech when necessary in order to avoid delays which would hinder conversation. The technique was evaluated using a model of the human speech comprehension process, which was tested using a spoken version...
Topics: DTIC Archive, Lynch,J T, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *Speech transmission,...
This dissertation describes a speech system based on a combination of physiological and psychoacoustic results which has been developed. The system contains a nonuniform Filter/Detector bank. A new relationship between Filter/Detectors and the Short-time Fourier Transform magnitude is derived, and a generalized version of the Short-Time Fourier Transform magnitude is used to implement the analysis system. The new relationship is also applied to a discussion of channel vocoders, spectrograms,...
Topics: DTIC Archive, Anderson,J C, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH RECOGNITION,...
This technical report covers a variety of topics and development areas. Original research on the decomposition of a Canonical Coordinate Transformation process based on non-Euclidean error minimization criteria is covered. The implementation of an algorithm resulting from this work and its application to sample error metrics is described. The definition of an algorithm resulting from this work and its application to sample error metrics is described. The definition of a spectral moment error...
Topics: DTIC Archive, Tardelli,John D, ARCON CORP WALTHAM MA, *COMPUTER PROGRAMS, *LINEAR SYSTEMS, *DIGITAL...
In the preliminary experiments described in this report, Hopfield networks have proved to be fairly robust implementations of associative memory modules. For speech-processing applications, an appropriate binary representation of the speech is central for obtaining good performance. For different applications (e.g., recognition, vocoding, pitch detection) the representation is bound to be different. Our attempt to devise an automatic DRT with Hopfield networks indicates that incorporating null...
Topics: DTIC Archive, Gold, Bernard, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH...
Existing analysis of the carrier sense multiple access mode (CSMA) has led to the determination of the average channel performance in terms of average throughput and average packet delay. This was achieved by formulating a semi-Markovian model for CSMA channels with a finite population of users. In this paper, it is shown that, using the same model, it is possible to derive the actual packet delay distribution. The analysis is similar in nature to that provided for slotted ALOHA channels. These...
Topics: DTIC Archive, Tobagi,Fouad A, STANFORD UNIV CA COMPUTER SYSTEMS LAB, *RADIO LINKS, *CHANNELS,...
This report describes work performed on the Packet Speech Systems Technology Program sponsored by the Information Processing Techniques Office of the Defense Advanced Research Projects Agency during the period 1 October 1979 through 31 March 1980.
Topics: DTIC Archive, Weinstein, Clifford J, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
This document reports on work toward a very low rate vocoder. We model speech as a Markov Chain of spectral templates for the unsupervised learning approach to very low rate vocoding. This quarter we compared several clustering techniques. We determined that a hierarchical approach to clustering is economical with minimal loss in performance. Furthermore, we found that a small number of spectral templates (from 128 to 256) is sufficient for vocoding with good intelligibility. Also, in the...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH TRANSMISSION,...
We report on research toward a very-low-rate vocoder. This quarter we continued investigation in two areas: the phonetic vocoder and an unsupervised method for vocoding. We introduced phoneme pair probabilities to improve the accuracy of phonetic recognition. We investigated the use of the phonetic vocoder as a tool for semi-automatic labeling of speech. We also experimented with several variations of the phonetic vocoder to improve the intelligibility of the vocoded speech with a moderate...
Topics: DTIC Archive, Makhoul, John, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH COMPRESSION,...