The results of previous technical reports are summarized and the results of tests of various recently developed speech compression systems are presented and analyzed. The results can be summarized as follows: (1) Semi-vocoders, operating at 9600 bits/sec, and channel vocoders, at 2400 bits/sec, will provide speech of adequate intelligibility and quality for most military communications. The voice quality of the semi-vocoders will usually be somewhat superior to that of the channel vocoders....
Topics: DTIC Archive, Kryter, Karl D, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *SPEECH COMPRESSION,...
An introduction to vocoders is presented. An elementary discussion of speech fundamentals is followed by a brief description of the different branches of speech research work. Explanations are presented of channel vocoders, voice-excited vocoders and, finally, the vocoder built for this research.
Topics: DTIC Archive, Gold, Bernard, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH, *VOCODERS,...
The report summarizes the results of a program of research on communication system evaluation from the standpoint of speech intelligibility and speaker recognizability. The history and present status of the Diagnostic Rhyme Test (DRT) Form III are described along with the results of research relating to the validity of the DRT in various applications.
Topics: DTIC Archive, Voiers, William D, SPERRY RAND RESEARCH CENTER SUDBURY MA, *SPEECH, SPEECH...
In order to determine if speech whose fundamental is absent can have its pitch accurately restored so as to be used as an input to a vocoder, a computer simulation was performed. The fundamental was restored by passing the speech through a fullwave rectifier followed by a slope filter. The accuracy of the pitch restoration of this method was compared with that of simply measuring the pitch of speech whose fundamental was present by slope filtering alone. A third pitch detection method, that of...
Topics: DTIC Archive, Goldberg, A J, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH RECOGNITION,...
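The rectification step described in the abstract above can be illustrated with a minimal sketch. It assumes a synthetic test signal built from the second and third harmonics of a 100 Hz pitch (no fundamental), and it probes the energy at the fundamental with a single-frequency DFT rather than the slope filter used in the report. The function name, sample rate, and signal are illustrative assumptions, not taken from the report.

```python
import math

def tone_amplitude(signal, freq_hz, sample_rate_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT probe)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

SAMPLE_RATE_HZ = 8000   # assumed sample rate
F0_HZ = 100             # assumed pitch
N = 800                 # exactly 10 pitch periods, so DFT bins line up

# Harmonics 2 and 3 only: the fundamental is absent, as in the abstract.
no_fundamental = [
    math.sin(2 * math.pi * 2 * F0_HZ * i / SAMPLE_RATE_HZ)
    + math.sin(2 * math.pi * 3 * F0_HZ * i / SAMPLE_RATE_HZ)
    for i in range(N)
]

# Full-wave rectification is a nonlinearity that creates a difference
# component at 3*F0 - 2*F0 = F0.
rectified = [abs(s) for s in no_fundamental]

amp_before = tone_amplitude(no_fundamental, F0_HZ, SAMPLE_RATE_HZ)
amp_after = tone_amplitude(rectified, F0_HZ, SAMPLE_RATE_HZ)
print(f"amplitude at F0 before rectification: {amp_before:.6f}")
print(f"amplitude at F0 after rectification:  {amp_after:.6f}")
```

In this toy case the component at the fundamental is essentially zero before rectification and clearly present afterwards, which is what makes a subsequent pitch measurement possible.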
The document describes Category II field testing of AN/USC-12 HF radio modem equipment, AN/GSC-20 wire-line modem equipment and the AN/USM-235 digital test set. Most of this testing was done during November and December 1966 between Cape Kennedy AFS and the Antigua, W.I. range station; however, a few interface tests were done at a later date subject to the availability of certain test facilities. The results of this test program are included with appropriate comments.
Topics: DTIC Archive, Greim, Robert E, MITRE CORP BEDFORD MA, *DATA TRANSMISSION SYSTEMS, HIGH FREQUENCY,...
The report presents the results of a survey in which the Diagnostic Rhyme Test was used to evaluate the present state of digital technique for speech processing and communication. Also presented are the results of a series of minor studies concerned with the methodology of intelligibility evaluation.
Topics: DTIC Archive, Voiers, William D, TRACOR INC AUSTIN TX, *VOCODERS, DIGITAL SYSTEMS,...
Research into and the development of instrumentation for the investigation of factors affecting the quality of vocoded speech are documented. The work reported was specifically concerned with developing a better understanding of the role of the vocal source in the production of both synthetic speech and of natural speech. The design of and operating instructions for the VOTIF vocal tract inverse filter - built as part of the program - are presented. A theoretical determination of the...
Topics: DTIC Archive, Crystal, Thomas H, SIGNATRON INC LEXINGTON MA, *SPEECH, *VOCODERS, INTERACTIONS,...
The report summarizes the results of a program of research concerned with the perception of degraded speech in normal and pathological listeners. The comparisons were derived from performance in a two-category judgment task where the anchor values remain accessible to the subject and under the control of the experimenter. Response decisions and frequency of anchor testings are recorded together with decision times in computer compatible tape format. In addition, studies were designed to examine...
Topics: DTIC Archive, Mostofsky, David I, EDUCATIONAL RESEARCH CORP CAMBRIDGE MA, *SPEECH RECOGNITION,...
Data compression, in the paper, is used to denote the reducing of an input data set prior to transmission, as opposed to data reduction, the analytical processing of the set upon reception. Source encoding techniques are classified into sampling and analog-to-digital conversion, codebook techniques, predictive subtractive coding and delta modulation, along with aperture and partitioning techniques. Some of the most recent work discussed includes adaptive predictive coding, picture bandwidth...
Topics: DTIC Archive, Schwartz, Jay W, NORTH CAROLINA STATE UNIV AT RALEIGH DEPT OF ELECTRICAL ENGINEERING,...
Much of the redundancy in a speech or television signal is eliminated when that signal is encoded into digital form by a differential PCM encoder. Further coding of the differential PCM output using entropy coding techniques (Huffman or Shannon-Fano coding) can result in a further increase in the signal to quantizing noise ratio of 5.6 dB without increasing the transmission rate. This conclusion is reached by comparing the performance of a DPCM system without entropy coding with one using...
Topics: DTIC Archive, O'Neal, Jr, J B, NORTH CAROLINA STATE UNIV AT RALEIGH DEPT OF ELECTRICAL ENGINEERING,...
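The idea behind entropy-coding a DPCM output can be sketched as follows. This is a toy illustration, not the report's system: it assumes a synthetic first-order autoregressive source, a previous-sample predictor, and a 16-level uniform quantizer, and it only compares the empirical entropy of the quantizer indices with the fixed-length cost of 4 bits per sample. It does not reproduce the 5.6 dB figure quoted in the abstract. All names and parameters are illustrative assumptions.

```python
import math
import random
from collections import Counter

def dpcm_indices(samples, step, levels):
    """Quantize prediction errors; the predictor is the previous reconstructed sample."""
    half = levels // 2
    indices = []
    prediction = 0.0
    for x in samples:
        error = x - prediction
        q = int(round(error / step))
        q = max(-half, min(half - 1, q))   # clip to the available levels
        indices.append(q)
        prediction += q * step             # decoder can form the same reconstruction
    return indices

def entropy_bits(symbols):
    """Empirical first-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Assumed test source: a strongly correlated AR(1) process.
rng = random.Random(0)
samples = []
value = 0.0
for _ in range(5000):
    value = 0.95 * value + rng.gauss(0.0, 1.0)
    samples.append(value)

LEVELS = 16
indices = dpcm_indices(samples, step=0.5, levels=LEVELS)
fixed_bits = math.log2(LEVELS)
entropy = entropy_bits(indices)
print(f"fixed-length code: {fixed_bits:.2f} bits/sample")
print(f"index entropy:     {entropy:.2f} bits/sample")
```

Because small prediction errors are far more common than large ones, the index entropy falls below the fixed-length rate. A Huffman or Shannon-Fano code can exploit that gap, and the saved bits can instead be spent on a finer quantizer at the same transmission rate.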
The activities of the Psychometrics Department of TRACOR, Inc., fall into two major categories. In the first category are research activities undertaken with the aim of developing improved methods for evaluating voice communication systems and devices. In the second category are testing services performed with processed speech materials supplied by the contract monitor. The research activities included five major research projects, and the technical papers that resulted from them are presented in the report.
Topics: DTIC Archive, Voiers, William D, TRACOR INC AUSTIN TX, *INTELLIGIBILITY, *SPEECH, *VOICE...
The author discusses methods and problems of acoustic signal processing for systems to enable machines to understand spoken communication. Emphasis is on research outside of the ARPA-sponsored SUR (Speech Understanding Research) study. This acoustic level processing includes three steps, not necessarily distinct: (1) preprocessing the original analog signal or its digitized form by basic techniques such as amplitude compression; (2) analysis of the preprocessed signals using fast Fourier...
Topics: DTIC Archive, Hoffman, A S, RAND CORP SANTA MONICA CA, *ACOUSTIC SIGNALS, *SIGNAL PROCESSING,...
Natural Communication with Computer IV: broad-based computer science research and development work is performed in areas including speech understanding systems, speech compression, development of programs and programming aids, techniques for extending computer I/O capabilities, research and development on time sharing systems, and distributed computation. This research program involves the ability to represent knowledge and deal with it in computer oriented terms, requires systems capable of...
Topics: DTIC Archive, BOLT BERANEK AND NEWMAN INC CAMBRIDGE MA, *COMPUTER PROGRAMMING, *SPEECH RECOGNITION,...
This study is aimed at the broad goal of the DoD Secure Voice Consortium to develop hardware models of improved narrow-band voice coders. The study is focused on the 'pitch and voicing' problem. The objective is to conceive and demonstrate the feasibility of two or more improved strategies to estimate and encode the excitation parameters of human speech. The decoded parameters will be used to excite a time-varying vocal tract 'filter' in the synthesizer.
Topics: DTIC Archive, Magill, D T, STANFORD RESEARCH INST MENLO PARK CA, *CODING, *SPEECH RECOGNITION,...
This report summarizes the activities of the Decision Sciences Laboratory and describes achievements, progress, and results obtained by the Laboratory scientists in the past two years.
Topics: DTIC Archive, ELECTRONIC SYSTEMS DIV L G HANSCOM FIELD MA DECISION SCIENCES LAB, *HUMAN FACTORS...
This document is an overview of the NELC IR and IED programs. It summarizes the accomplishments achieved within each project in FY74. Longer articles are presented on three of the most significant projects -- EHF Integrated Circuits, Programmable Electro-Optical Processor, and Telecommunication Equipment Low-Cost Acquisition Method (TELCAM).
Topics: DTIC Archive, NAVAL ELECTRONICS LAB CENTER SAN DIEGO CA, *NAVAL RESEARCH, SIGNAL PROCESSING, FIBER...
This report describes work in developing a linear predictive speech compression system that transmits high quality speech at low bit rates. The authors have developed several methods for reducing the redundancy in the speech signal without sacrificing speech quality. Included among these methods are preemphasis of the incoming speech signal, adaptive optimal selection of predictor order, optimal selection and quantization of transmission parameters, variable frame rate transmission, optimal...
Topics: DTIC Archive, Sutherland, William R, Makhoul, John, Viswanathan, R, Cosell, Lynn, Russell, William,...
A problem basic to the development of all digital telecommunication systems is the efficient digital encoding of speech signals for transmission. Many speech digitization algorithms have been proposed. This study, using as a research vehicle the homomorphic vocoder algorithm, is directed toward determining the time resolution and the frequency resolution required to faithfully reproduce a speech signal. The results reported are fundamental in that they are applicable to any speech digitization...
Topics: DTIC Archive, Bush, Aubrey M., GEORGIA INST OF TECH ATLANTA SCHOOL OF ELECTRICAL ENGINEERING,...
This report describes research into the problem of rectification of sound recordings made under adverse conditions and communicated and recorded with a great deal of noise. In the course of this research, a number of refinements have been made to the process of digital speech synthesis through new vocoder techniques. The particular case under investigation was that of old, noisy recordings of a singing voice and the immediate goal was the separation of that voice from wide band noise,...
Topics: DTIC Archive, Miller, Neil Joseph, UTAH UNIV SALT LAKE CITY DEPT OF COMPUTER SCIENCE, *VOICE...
An ultra-high performance programmable speech processor, consisting in the main of a custom designed 55-nsec microcomputer, has been designed and constructed. To date, five real-time speech compression programs have been implemented and evaluated.
Topics: DTIC Archive, Blankenship, Peter E., Hofstetter, Edward M., Huntoon, Albert H., Malpass, Marilyn...
Two broad categories of work are described: algorithmic studies and hardware design. Several algorithms for digitizing and reducing the data rate of speech signals are described. These algorithms include an adaptive residual coder (ARC) designed to produce data at 16 and 9.6 kbps, an adaptive predictive coder (APC) at 8 kbps, a voice-excited linear predictor (VELP) at 8 kbps, and a straight linear predictive coded (LPC) vocoder at 2.4, 3.6, and 4.8 kbps. In addition, some work on pitch or...
Topics: DTIC Archive, Gold, Bernard, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *VOICE...
A telephone channel simulator has been implemented on the Lincoln Digital Voice Terminal (LDVT). This technical note first reviews the various distortions that occur in telephone channels and then describes the implementation. The results obtained indicate that the potential of the LDVT goes beyond the original intent of its use as a speech digitizer.
Topics: DTIC Archive, Seneff, Stephanie, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *TELEPHONE...
The Information Processing Techniques Program sponsored by DARPA at Lincoln Laboratory consists of three efforts: Packet Speech (Network Speech Compression), Acoustic Convolvers, and Airborne Command and Control. In this Semiannual Technical Summary, the first two areas are reported in Vol. 1 and the third in Vol. 2. In addition, Vol. 1 contains a brief summary report on work in Speech Understanding completed in FY 1974.
Topics: DTIC Archive, Gold, Bernard, Stern, Ernest, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
Contents: Packet Speech; and Acoustoelectric Convolvers.
Topics: DTIC Archive, Gold, Bernard, Stern, Ernest, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
A programmatically controlled analog switch unit for multiplexing several participants using a single vocoder into a network voice conference is described. Two methods of using this unit to support variations in conference protocols are discussed. An investigation was made of a proposed variable frame rate transmission system for network voice communication. This LPC-II system was compared with the existing system and several other possible choices. The comparisons are based on listening tests...
Topics: DTIC Archive, Culler, Glen J, McCammon, Michael, CULLER/HARRISON INC GOLETA CA, *COMMUNICATIONS...
A working computer system for transmission and receiving of speech messages over the ARPANET using variable frame rate linear predictive coding is described. This system conforms to the LPC-2 protocols. A detailed discussion is given of the known problems and pitfalls in speech transmission. A description is given of the methods used to deal with these problems.
Topics: DTIC Archive, McCammon, Michael, Culler, Glen J., CULLER/HARRISON INC GOLETA CA,...
This report describes work performed under three programs: Packet Speech, Acoustic Convolvers, and Airborne Command and Control sponsored by the Information Processing Techniques Office of the Defense Advanced Research Projects Agency during the semiannual period 1 January through 30 June 1976. The first two programs are reported in Vol. 1 and the third in Vol. 2.
Topics: DTIC Archive, Gold, Bernard, Stern, Ernest, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
This report describes the Codex Speech Digitizer Advanced Development Model. The speech coder operates at rates of 16 and 9.6 kbps using three selectable techniques: Adaptive Residual Coding (ARC), Continuously Variable Slope Delta (CVSD) modulation, and Adaptive Delta Modulation (ADM). (Author)
Topics: DTIC Archive, Forney,G D, Qureshi,S, CODEX CORP NEWTON MASS, *COMPUTER PROGRAMS, *CODING, *SPEECH...
This report describes software designed by GTE Sylvania for a high quality 16,000 bit per sec. speech terminal. This software operates in a full duplex mode on two Sylvania Programmable Signal Processors. (Author)
Topics: DTIC Archive, Goldberg,A, GTE SYLVANIA INC NEEDHAM HEIGHTS MASS ELECTRONIC SYSTEMS GROUP-EASTERN...
Past efforts to achieve practical formant vocoders have been plagued with problems of formant tracker instability, resulting in unnatural warbles in the synthesized speech. A new approach to formant frequency determination, combined with a digital implementation, promises to eliminate these effects and to yield a useful formant vocoder. Additional redundancy reduction of information is obtained by means of a pattern-matching technique, which encodes the three formant frequencies into seven bits...
Topics: DTIC Archive, Kang,George S, Coulter,David C, NAVAL RESEARCH LAB WASHINGTON D C, *VOCODERS, LINEAR...
A microprocessor realization for a linear predictive vocoder is presented. The goal was a low-power, low-cost, compact special purpose realization of a narrow band speech terminal. The resultant design is a general purpose two bus structure running at a 150 ns cycle time using as the basic signal processing element four of the AMD 2901 CPE chips. This basic structure is augmented by a four cycle multiplier to allow for sufficient signal processing power. The design concessions that mark the...
Topics: DTIC Archive, Hofstetter, Edward M, Tierney, Joseph, Wheeler, Omar C, MASSACHUSETTS INST OF TECH...
Coding under a fidelity criterion of a class of sources which emit randomly occurring messages is investigated. This class of sources models information carrying processes entering into communication networks. Messages emitted by computer terminals, teletypes, vocoders, and other such devices serve as actual examples. For this class of sources the rate distortion function is derived, and source coding and converse source coding theorems are proven. Employing these theorems, an operational...
Topics: DTIC Archive, Nussbaum,Howard S, CALIFORNIA UNIV LOS ANGELES SCHOOL OF ENGINEERING AND APPLIED...
The problem of determining whether a given interval of a speech signal should be classified as voiced speech, unvoiced speech or silence is formulated as a test of statistical hypotheses. A robust detector is obtained by modelling the speech and the acoustic background noise signals as correlated Gaussian random processes. The methods of statistical decision theory are applied to these models to synthesize an optimum, minimum probability of error, classifier. The optimum classifier is an...
Topics: DTIC Archive, McAulay,Robert J, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SIGNAL...
This appendix to Lincoln Laboratory Technical Note 1976-37 provides all of the detailed drawings, layouts and cabling information needed to construct a vocoder identical to the one described in the technical note. The additional comments may clear up any remaining questions concerning this appendix.
Topics: DTIC Archive, Hofstetter, Edward M., MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
A real time harmonic pitch detection algorithm has been developed on the Lincoln Digital Voice Terminal (LDVT). The algorithm was designed to be fast and to perform well when the input speech is degraded (i.e., telephone quality) or corrupted with acoustically coupled noise. The algorithm determines the fundamental frequency from the spacing between harmonics in a selected portion of the spectrum. The algorithm was incorporated into a real time linear prediction vocoder and compared favorably...
Topics: DTIC Archive, Seneff, Stephanie, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *PITCH...
This report contains a discussion of several continuously-variable-slope delta modulator (CVSD) modifications warranting theoretical and experimental evaluation as performance-improving techniques. These candidate techniques were derived from a survey of the applicable research literature on delta modulation (ΔM) and other waveform encoding methods. The report begins with brief background information on low-data-rate speech encoding, delta modulation, and CVSD.
Topics: DTIC Archive, Morris,Joel M, NAVAL RESEARCH LAB WASHINGTON D C, *VOCODERS,...
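For readers unfamiliar with CVSD, the basic mechanism can be sketched as a one-bit-per-sample coder whose step size grows when the last three bits agree (a sign of slope overload) and decays otherwise. The sketch below is a generic textbook-style illustration under assumed parameters (step limits, growth and decay factors, a three-bit coincidence rule, no leakage in the integrator). It is not the modulator or any of the modifications evaluated in the report.

```python
import math

MIN_STEP = 0.01   # assumed smallest step size
MAX_STEP = 0.2    # assumed largest step size

class CvsdState:
    """Integrator and step-size logic shared by the encoder and decoder."""
    def __init__(self):
        self.estimate = 0.0
        self.step = MIN_STEP
        self.history = []

    def update(self, bit):
        self.history = (self.history + [bit])[-3:]
        if len(self.history) == 3 and len(set(self.history)) == 1:
            # Three identical bits: the estimate is lagging, so enlarge the step.
            self.step = min(self.step * 1.5, MAX_STEP)
        else:
            self.step = max(self.step * 0.9, MIN_STEP)
        self.estimate += self.step if bit else -self.step
        return self.estimate

def cvsd_encode(samples):
    """Emit 1 when the input is at or above the running estimate, else 0."""
    state = CvsdState()
    bits = []
    for x in samples:
        bit = 1 if x >= state.estimate else 0
        bits.append(bit)
        state.update(bit)
    return bits

def cvsd_decode(bits):
    """Rebuild the estimate from the bit stream alone."""
    state = CvsdState()
    return [state.update(bit) for bit in bits]

SAMPLE_RATE_HZ = 16000  # one bit per sample, i.e. 16 kbps
tone = [math.sin(2 * math.pi * 200 * i / SAMPLE_RATE_HZ) for i in range(1600)]
bits = cvsd_encode(tone)
decoded = cvsd_decode(bits)
mean_abs_error = sum(abs(a - b) for a, b in zip(tone, decoded)) / len(tone)
print(f"mean absolute error: {mean_abs_error:.3f}")
```

Because the step size is derived only from past bits, the decoder can track the encoder without any side information, which is the property that makes this family of coders attractive at low data rates.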
Autocorrelation Pole-Zero modeling identifies the parameters of a rational transfer function H(z) whose short time-lag autocorrelations either exactly match (Autocorrelation Partial Realization--APR) or closely approximate (Autocorrelation Prediction--AP) those of a given spectrum. As a result, the spectrum of the H(z) obtained from either method approximates the gross structure of the given spectrum. APR uses the Padé approximation to determine the denominator coefficients of H(z). In contrast,...
Topics: DTIC Archive, Atashroo, Mohammad Ali, UTAH UNIV SALT LAKE CITY SCHOOL OF COMPUTING, *SIGNAL...
A promising method of automatic word recognition in continuous speech, recently designated as 'word spotting', has been demonstrated. The method uses error residual ratios from LPC (Linear Predictive Coding) vocoder analysis for waveform comparison and a dynamic programming procedure for time registration between the incoming speech and a template of the key word. Using a similarity threshold, the incoming speech is compared with several templates to account for variability in spectral shape....
Topics: DTIC Archive, Christiansen, Richard Wesley, UTAH UNIV SALT LAKE CITY SCHOOL OF COMPUTING, *DYNAMIC...
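The dynamic programming time registration mentioned in the abstract above is in the spirit of what is now usually called dynamic time warping. A minimal sketch follows, assuming scalar features and an absolute-difference local distance in place of the LPC error residual ratios used in the report. The function name and the test sequences are illustrative assumptions.

```python
def dtw_distance(template, incoming):
    """Accumulated cost of the best monotonic alignment between two sequences."""
    n, m = len(template), len(incoming)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning template[:i] with incoming[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local distance; the report uses LPC error residual ratios.
            local = abs(template[i - 1] - incoming[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # template frame advances alone
                cost[i][j - 1],      # incoming frame advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]

template = [0.0, 1.0, 2.0, 1.0, 0.0]
stretched = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]   # same shape, spoken slower
other = [2.0, 2.0, 0.0, 0.0, 2.0, 2.0, 0.0, 2.0]       # a different shape

print(dtw_distance(template, stretched))
print(dtw_distance(template, other))
```

A word spotter along these lines would compare the accumulated cost against a similarity threshold, declaring a match when a stretch of incoming speech aligns closely enough with a key-word template.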
This note describes the Lincoln Integrated Speech Synthesizer (LISSYN), a general-purpose computer intended for speech processing, whose central processor is made from ECL gate arrays (large scale integrated circuits custom built at Lincoln Laboratory). The goal was to use gate arrays to implement in real time the synthesis portion of a linear predictive vocoder operating at 4800 bits/sec. The design process stressed minimizing the number of different kinds of gate arrays and the number of...
Topics: DTIC Archive, Berger,Robert, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH, *DIGITAL...
This volume reports this year's work on the DCA speech evaluation contract. Three general areas of work are reported: (1) Work on narrowband terminal 'robustness'; (2) Work on wideband-narrowband tandeming; and (3) Hardware speech-terminal efforts. The robustness issues are defined early in this report; then, work on telephone-line simulation, robust pitch extraction, and operation of LPC vocoders in acoustically noisy environments is reported. This report also discusses some approaches and...
Topics: DTIC Archive, Gold,Bernard, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH TRANSMISSION,...
The objectives of this study are (1) to investigate the system-wide implications of using Demand-Assignment Multiple-Access (DAMA) techniques with communication satellites to serve a diverse community of voice and data communication users, and (2) to explore extensions of the current DAMA techniques to Random Multiple Access (RMA) techniques which are suitable for data and packetized voice. The major emphasis of the study is exploring system-wide cost-performance tradeoffs in order to...
Topics: DTIC Archive, Hilborn,C G , Jr, SYSTEMS CONTROL INC PALO ALTO CALIF, *RANDOM ACCESS COMPUTER...
The effects of g-force stress on human voice patterns were investigated with the objective of finding means for making isolated-word recognition devices work in the fighter aircraft cockpit environment. Data were taken in a human centrifuge with SCOPE Electronics Inc's Voice Data Entry System (VDETS) used to prompt and pace the subjects. Data were subsequently digitized and stored for analysis and recognition experiments using the VDETS algorithm with a number of variations. Recognition...
Topics: DTIC Archive, Montague,Hill, SCOPE ELECTRONICS INC RESTON VA, *ACCELERATION, *STRESS(PHYSIOLOGY),...
The goal of this program is the investigation and development of techniques for communications-adaptive internetting with particular emphasis on digital voice communications. The specific class of problems addressed relates to packet speech networks whose interconnecting links may be stressed as a result of traffic overloads or are time varying due to natural or hostile actions. This program extends the technology of fixed-topology packet-switching speech communications networks by introducing...
Topics: DTIC Archive, Gold, Bernard, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *COMMAND AND CONTROL...
The Packet Speech Measurement Facility (PSMF) is a recording, playback, and measurement facility designed to provide members of the Network Secure Communications (NSC) Project with an investigative tool for packetized speech research. PSMF access will facilitate experiments dealing with the effects of network induced perturbations on real-time communications. This report describes efforts which have culminated in the specification of PSMF design, the development of an access protocol, and the...
Topics: DTIC Archive, Low,D A, COMPUTER CORP OF AMERICA CAMBRIDGE MASS, *SPEECH TRANSMISSION, *SPEECH,...
The application of Burst Processing to the problem of spectral decomposition of speech is discussed. It is shown that such a representation provides a viable alternative to conventional speech analyzers. A specific Burst implementation is presented. (Author)
Topics: DTIC Archive, Xydes,Christ John, ILLINOIS UNIV AT URBANA-CHAMPAIGN DEPT OF COMPUTER SCIENCE,...
A complete, real-time, channel vocoder delivering good speech quality with a 2400-bit/second data transmission rate was implemented using purely digital circuitry in the form of a high-speed programmed microprocessor. Necessary algorithms are presented and their effect on the machine design is discussed in detail. The end product is a very high-speed computing machine (measured in program throughput terms). It turned out to have a high degree of programming flexibility, which would make it...
Topics: DTIC Archive, Gorski-Popiel, Jerzy, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB,...
This is the first Semiannual Technical Summary Report on the Network Speech Processing Program to be submitted to the Defense Communications Agency. It covers the period 1 October 1976 through 31 March 1977 and reports on the following topics: Secure Voice Conferencing, Speech Algorithms, and Bandwidth Efficient Communications. Each of these tasks is directed to particular problems associated with AUTOSEVOCOM II and/or the design of future defense communications systems.
Topics: DTIC Archive, Herlin, Melvin A., MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SECURE...
This paper reports on the effects--examined in parametric fashion--on the overall voice quality, acceptability, and communicability of speech packetization and its transmission through a packet-switched network. Speech processed through a number of real-time simulation programs developed to create anticipated anomalies (glitches) in packet speech systems was evaluated by informal acceptability testing. Depending on system design parameters, test results indicate that packet-system speech...
Topics: DTIC Archive, Forgie, James W, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *SPEECH...
This note describes measurements made on a computer simulation of a model of a Packetized Virtual Circuit (PVC) network link. The simulation models a population of speakers in conversation, and a Poisson data source. Such variables as buffer space requirements, packet loss and delay, and link utilization are investigated as functions of voice and data loads on the system. Initial results indicate that voice and data can be satisfactorily integrated on a 1.544 Mbps packetized communications...
Topics: DTIC Archive, Demko, Paul, MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB, *COMPUTERIZED...
Development of the artificial cochlea was reported under a previous grant. Work on this grant was directed at testing the cochlea for automatic speech recognition (ASR). Use of the HP-9830 indicated that it is a good machine for low budget ASR if ways can be found to slow down the input data rate. Data were taken on the 21 prolongable phonemes using the cochlea model. One surprise noted was a finding of high consistency in the first moment of the cochlea's response to a given phoneme, irrespective of pitch....
Topics: DTIC Archive, Bolie,Victor W, NEW MEXICO UNIV ALBUQUERQUE DEPT OF ELECTRICAL ENGINEERING AND...