List of papers published in international conference proceedings, 2013

No. Title (code, implementation period) Authors Abstract Publication venue
International conference proceedings
1 A Method for Hand Detection Using Internal Features and Active Boosting-Based Learning Nguyễn Văn Tới, Nguyễn Thị Thủy, Remy Mullot Hand posture recognition has important applications in sign language, human-machine interfaces, etc. In most such systems, the first important step is hand detection. This paper presents a hand detection method based on internal features in an active boosting-based learning framework. The use of efficient Haar-like, local binary pattern, and local orientation histogram features as internal features allows fast computation of informative hand features for dealing with a great variety of hand appearances without background interference. The Fourth International Symposium on Information and Communication Technology, 2013, pp. 213-221
2 Temporal gesture segmentation for recognition Lê Thị Lan, Nguyễn Văn Ngọc, Trần Thị Thanh Hải, Nguyễn Văn Tới, Nguyễn Thị Thủy This paper presents a method for temporal gesture segmentation based on the total activity of the video sequence. The novelty of this method is that we apply filters to the sequence and to the total activity plot, which makes the method more robust to noise. The method has been shown to be very efficient on a very large dataset from the recent ChaLearn contest on hand gesture recognition. International Conference on Computing, Management & Telecommunication, 21-24/01/2013, pp. 369-373
3 Vision based dynamic hand gesture recognition Trần Thanh Hải, Nguyễn Văn Tới, Nguyễn Văn Ngọc, Quentin Midy This paper presents a comparative study of methods for dynamic hand gesture recognition. We study three methods: PCA-KNN, GIST-KNN, and KDES-A. The evaluation dataset consists of 20 Italian hand gestures provided by the ChaLearn contest... 2013 ICT Pamm Workshop on Mobility Assistance and Service Robotics, pp. 43-47
4 Warped Minimum Variance Distortionless Response Based Bottleneck Features for LVCSR Kevin Kilgour, Igor Tseyzer, Quoc Bao Nguyen, and Alex Waibel This paper presents the results of our experiments on bottleneck features applied to a wMVDR (Warped Minimum Variance Distortionless Response) frontend. We examine how best to optimize wMVDR-BNF features and wMVDR combined with MFCC bottleneck features (wMVDR+MFCC-BNF). Our wMVDR+MFCC-BNF frontend improves a single-pass system from 18.7% (20.7%) to 18.1% compared to an MFCC-BNF (MFCC) system tested on the Quaero 2010 German evaluation set. When used in a system combination, our wMVDR-BNF and wMVDR+MFCC-BNF systems reduced the overall WER from 14.3% to 13.3% on the IWSLT 2010 test set while at the same time reducing the number of systems needed from 9 to 5. Our result of 11.9% on the 2012 IWSLT test set is better than the best result submitted during the evaluation campaign. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6990-6994
5 Segmentation of Telephone Speech Based on Speech and Non-Speech Models Michael Heck, Christian Mohr, Sebastian Stüker, Markus Müller, Kevin Kilgour, Jonas Gehring, Quoc Bao Nguyen, Van Huy Nguyen, and Alex Waibel In this paper we investigate the automatic segmentation of recorded telephone conversations based on models for speech and non-speech to find sentence-like chunks for use in speech recognition systems. Presented are two different approaches, based on Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs), respectively. The proposed methods provide segmentations that allow for competitive speech recognition performance in terms of word error rate (WER) compared to manual segmentation. 15th International Conference on Speech and Computer, SPECOM 2013, pp. 286-293
6 Optimizing Deep Bottleneck Feature Extraction Quoc Bao Nguyen, Jonas Gehring, Kevin Kilgour, and Alex Waibel We investigate several optimizations to a recently published architecture for extracting bottleneck features for large-vocabulary speech recognition with deep neural networks. We are able to improve recognition performance of first-pass systems from a 12% relative word error rate reduction reported previously to 21%, compared to MFCC baselines on a Tagalog conversational telephone speech corpus. This is achieved by using different input features, training the network to predict context-dependent targets, employing an efficient learning rate schedule, and varying several architectural details. Evaluations on two larger German and French speech transcription tasks show that the proposed optimizations are universally applicable and yield comparable gains on other corpora (19.9% and 22.8%, respectively). The 10th IEEE RIVF International Conference on Computing and Communication Technologies, RIVF 2013, pp. 152-156
7 Models of Tone for Tonal and Non-Tonal Languages Florian Metze, Zaid A. W. Sheikh, Alex Waibel, Jonas Gehring, Kevin Kilgour, Quoc Bao Nguyen, Van Huy Nguyen Conventional wisdom in automatic speech recognition asserts that pitch information is not helpful in building speech recognizers for non-tonal languages and contributes only modestly to good speech recognizers for tonal languages. To maintain consistency between recognizers, pitch is therefore often ignored, trading the slight performance benefits for greater system uniformity and simplicity. In this paper, we report results that challenge this conventional approach. We present new models of tone that deliver consistent performance improvements for tonal languages (Cantonese, Vietnamese) and even modest improvements for non-tonal languages. Using neural networks for feature integration and fusion, these models achieve significant gains throughout, and provide us with system uniformity and standardization across all languages, tonal and non-tonal. 2013 IEEE International Conference Automatic Speech Recognition and Understanding (ASRU), pp. 261-266
8 DNN Acoustic Modeling with Modular Multi-lingual Feature Extraction Networks Jonas Gehring, Quoc Bao Nguyen, Florian Metze, and Alex Waibel In this work, we propose several deep neural network architectures that are able to leverage data from multiple languages. Modularity is achieved by training networks for extracting high-level features and for estimating phoneme state posteriors separately, and then combining them for decoding in a hybrid DNN/HMM setup. This approach has been shown to achieve superior performance for single-language systems, and here we demonstrate that feature extractors benefit significantly from being trained as multi-lingual networks with shared hidden representations. We also show that existing mono-lingual networks can be re-used in a modular fashion to achieve a similar level of performance without having to train new networks on multi-lingual data. Furthermore, we investigate extending these architectures to make use of language-specific acoustic features. Evaluations are performed on a low-resource conversational telephone speech transcription task in Vietnamese, while additional data for acoustic model training is provided in Pashto, Tagalog, Turkish, and Cantonese. Improvements of up to 17.4% and 13.8% over mono-lingual GMMs and DNNs, respectively, are obtained. 2013 IEEE International Conference Automatic Speech Recognition and Understanding (ASRU), pp. 344-349
9 The 2013 KIT IWSLT Speech-to-Text Systems for German and English Kevin Kilgour, Christian Mohr, Michael Heck, Quoc Bao Nguyen, Van Huy Nguyen, Evgeniy Shin, Igor Tseyzer, Jonas Gehring, Markus Müller, Matthias Sperber, Sebastian Stüker, and Alex Waibel This paper describes our English Speech-to-Text (STT) systems for the 2013 IWSLT TED ASR track. The systems consist of multiple subsystems that are combinations of different front-ends (e.g. MVDR-MFCC based and lMel based ones), GMM and NN acoustic models, and different phone sets. The outputs of the subsystems are combined via confusion network combination. Decoding is done in two stages, where the systems of the second stage are adapted in an unsupervised manner on the combination of the first-stage outputs using VTLN, MLLR, and cMLLR. Proceedings of the International Workshop for Spoken Language Translation (IWSLT 2013), Heidelberg, December 5-6, 2013, pp. 107-112
10 The 2013 KIT Quaero Speech-to-Text System for French Joshua Winebarger, Bao Nguyen, Jonas Gehring, Sebastian Stüker, and Alexander Waibel This paper describes our Speech-to-Text (STT) system for French, which was developed as part of our efforts in the Quaero program for the 2013 evaluation. Our STT system consists of six subsystems which were created by combining multiple complementary sources of pronunciation modeling, including graphemes, with various feature front-ends based on deep neural networks and tonal features. Both speaker-independent and speaker-adaptively trained versions of the systems were built. The resulting systems were then combined via confusion network combination and cross-adaptation. Through progressive advances and system combination we reach a word error rate (WER) of 16.5% on the 2012 Quaero evaluation data. Proceedings of the International Workshop for Spoken Language Translation (IWSLT 2013), Heidelberg, December 5-6, 2013, pp. 235-242
11 Time series symbolization and search for frequent patterns Mai Văn Hoàn, Matthieu Exbrayat In this paper, we focus on two aspects of time series mining: first the transformation of numerical data into symbolic data, then the search for frequent patterns in the resulting symbolic time series. We are thus interested in patterns that have a high frequency in our database of time series and might help to generate candidates for various tasks in the area of time series mining. The 4th International Symposium on Information and Communication Technology, SoICT 2013, December 2013, pp. 108-118
12 Compare effective Fuzzy associative memories for Grey-scale image recognition Phạm Việt Bình, Nông Thị Hoa Pattern recognition (PR) is the most important field of image processing and has been widely developed by many scientists. The reason is that PR provides complete information about objects from noisy inputs. Several types of approach have been proposed to solve this problem, such as recognition based on key features and recognition based on the distribution of the histogram... First International Conference ICCASA 2012, HCM-Vietnam, 26-27/11/2012, pp. 258-268
13 A max-min learning rule for fuzzy ART Nông Thị Hoa, Bùi Thế Duy In this paper, we propose a max-min learning rule for fuzzy ART that learns all patterns of the training data and reduces the effect of abnormal training patterns. Our learning rule changes the weight vector of the winning category based on the minimal difference between the current input pattern and the old weight vector of the winning category. Proceedings of the 2013 IEEE RIVF International Conference on Information and Communication Technologies, pp. 53-57
14 Efficiency improvements for fuzzy associative memory Nông Thị Hoa, Bùi Thế Duy, Đặng Trung Kiên In this paper, we propose a novel FAM that effectively stores both the contents and the associations of patterns. We improve both the learning and the recalling processes of FAM. In the learning process, the associations and contents are stored by means of input and output patterns, and they are generalised by an erosion operator. 10th International Symposium on Neural Networks, Dalian, China, July 2013, Proceedings, pp. 36-43
15 High-speed block cipher algorithm based on hybrid method Đỗ Thị Bắc, Nguyễn Hiếu Minh This paper proposes three different designs of a new 64-bit block cipher diagram. A new feature of the designs is the application of hybrid CSPN. Each design has particular advantages, so the most appropriate one can be selected for each target application. Ubiquitous Information Technologies and Applications, CUTE 2013, pp. 285-291
16 Soft-Skill Pedagogy And Educator–Employer Collaboration In Soft-Skill Development In Business Students Through Curricular Interventions In Vietnam Thi Thu HANG TRUONG, Ronald S. LAURA A growing body of scholarly research on the many benefits of soft skills acquisition makes evident the need for the leaders in higher education institutions, especially in developing countries such as Vietnam, to become better acquainted with this literature. Educational leaders have the responsibility to prepare the student workforce with the skills required to satisfy relevant contemporary business interests of employers. For this effort to be effective, business educators and employers will, however, require access to collaborative programs which provide educative guidance on business curriculum revision. Achieving the goal of collaboratively creating a productive pedagogy for the inculcation of soft skills within Vietnamese business schools represents a significant advance in the quality of its educational processes and potential for efficacious business exchange, globally. GABER Tenth International Conference Proceedings, Dubai, United Arab Emirates, December 2013, pp. 28-38
17 The Effects of Educating for Soft Skills on Success in Career Development among Graduates at Universities of Economics and Business Administration in Vietnam Truong, Thi Thu Hang and Laura, Ronald Samuel It is a fundamental argument of this research paper that the current system of higher education in Vietnam needs to reconsider the extent to which its current curriculum is sufficiently comprehensive to prepare adequately the student workforce to meet the contemporary needs of employers. We shall argue that the focus on ‘hard skills’, though important, is too limited to do justice to the array of human resource skills now found to be essential to compete effectively in the national context and particularly in the global market place. Our specific pedagogic aim will be to show that students should be provided not only with a high level of technological competency, but with a deeper understanding of their full potential when grounded in the rich discourse of ‘soft skills’ pedagogy. We are confident that this more holistic framework will serve to foster innovative ideas, methods and evaluative techniques better designed for achieving excellence in tertiary education and in the competitive business context. We shall argue that human resource development with its emphasis on soft skills plays an integral role in advancing economic growth, especially in developing countries (Keeley, 2007). Given the importance of the human resource factor in business development, and the distinct role of higher education in that process, we aim to show that reforms of pedagogic structure are clearly necessary to maximise business school performance outcomes. Proceedings of the Second Annual Higher Degree Student-led Conference, 9 November 2012, pp. 105-119
18 Clustering Hierarchical data using SOM neural network Lê Anh Tú, Nguyễn Quang Hoan, Lê Sơn Thái This paper proposes a solution for clustering hierarchical data using a SOM neural network. The training process, which combines data partitioning and network partitioning, automatically forms a hierarchical tree structure that represents the clustering process in increasing detail from the root node to the leaf nodes. First International Conference ICCASA 2012, HCM-Vietnam, 26-27/11/2012, pp. 282-289
19 Void fraction detecting in microfluidic channel flows based on a vertical capacitive structure Vũ Quốc Tuấn, Nguyễn Đắc Hải, Phạm Quốc Thịnh, Nguyễn Đình Đức, Chử Đức Trình This paper presents a design of a microfluidic vertical capacitive sensor based on a printed circuit board. A three-electrode capacitor covers a plastic tube of sub-millimeter diameter. The electrodes of the capacitive sensors were fabricated using the traditional via connection in PCB technology. The capacitance change can be monitored using a differential capacitive amplifier, a lock-in amplifier, a filter, and an NI acquisition card. International Symposium on Frontiers of Materials Science, 17-19/11/2013, Hanoi, Vietnam, pp. 147-149
20 A Hybrid TTS between unit selection and HMM-based TTS under limited data conditions Phùng Trung Nghĩa, Lương Chi Mai, Masato Akagi The intelligibility of HMM-based TTS can reach that of the original speech. However, HMM-based TTS is far from natural. By contrast, unit selection TTS is currently the most natural-sounding TTS. However, its intelligibility and its naturalness in segmental duration and timing are not stable. Additionally, unit selection needs to store a huge amount of data for concatenation. Recently, hybrid approaches between these two TTS methods, i.e. the HMM trajectory tiling TTS (HTT), have been studied to take advantage of both unit selection and HMM-based TTS... 8th ISCA Speech Synthesis Workshop, August 31 - September 2, 2013, Spain, pp. 279-284
21 Improving the flexibility of unit selection TTS with temporal decomposition Phùng Trung Nghĩa, Lương Chi Mai, Masato Akagi In this paper, we propose a hybrid TTS using TD. The analyses and experimental results show that the proposed method combines the advantages of HMM-based TTS and unit selection while ensuring flexibility. In the future, we will implement the proposed method for Japanese and Chinese to confirm its unification and language independence. 8th ISCA Speech Synthesis Workshop, August 31 - September 2, 2013, Spain, pp. 279-284
22 Improving the naturalness of speech synthesized by HMM-based TTS by producing an appropriate smoothness Phùng Trung Nghĩa, Masato Akagi A hybrid method between HMM-based TTS and MRTD was proposed to solve the over-smoothness problem of HMM-based TTS under limited data conditions. The experimental results revealed that the proposed HTD could synthesize speech with an appropriate smoothness and articulate efficiently under limited data conditions in terms of both speech intelligibility and naturalness. Proceedings of ASJ Spring Meeting 2013, March, Tokyo, Japan, pp. 299-302, 2013
23 A concatenative speech synthesis for monosyllabic languages with limited data Phùng Trung Nghĩa, Lương Chi Mai, Masato Akagi The quality of unit-based concatenative speech synthesis is low, while that of corpus-based concatenative speech synthesis with unit selection is highly natural. However, unit selection requires a huge amount of data for concatenation, which reduces its range of applications. In this paper, by using temporal decomposition to model intra-syllable and inter-syllable contextual effects, we propose a context-fitting unit modification method and a context-matching unit selection method... Proceedings of ASJ Autumn Meeting 2013, Toyohashi, Japan, 2013
24 Transformation of F0 contours for lexical tones concatenative speech synthesis of tonal languages Phùng Trung Nghĩa, Lương Chi Mai, Masato Akagi Concatenative speech synthesis (CSS) provides the greatest naturalness. However, it requires a huge stored database, resulting in a huge footprint. Reducing the size of the stored database while preserving the quality of CSS, or improving the quality-to-size ratio (QSr), is still a challenge. In this paper, we propose a method for transforming the fundamental frequency (F0) contours of lexical tones. It is developed from the TD-GMM framework, which was successfully applied to transforming spectral sequences in previous research, in order to improve the QSr of CSS for tonal languages. The result is CSS that works with limited data at the offline stage and stores a small online footprint while preserving perceptual quality... International Conference on Speech Database and Assessments (Oriental COCOSDA) 2012, pp. 129-134, December, Macau, 2012.
25 Alternative SOS aux conditions LMI pour la synthèse de lois de commande non-PDC pour les modèles flous T-S Dương Chính Cương, K. Guelton, N. Manamanni Nowadays, for the non-quadratic analysis of Takagi-Sugeno (T-S) fuzzy systems, the results successively proposed in LMI form tend toward increasingly complex conditions with a sometimes questionable reduction in conservatism. This paper presents... 22èmes Rencontres francophones sur la Logique Floue et ses Applications (LFA 2013), 10-11 October 2013, Reims, France, pp. 163-170
26 Application of CURE data Clustering Algorithm to Batangas State University student database Nguyễn Thị Linh, Christopher C. Chua Clustering is said to be one of the most complex, well-known, and most studied problems in data mining theory. Data clustering is the process of grouping data into classes or clusters, so that objects within a cluster have high similarity to one another but are very dissimilar to objects in other clusters. The increasing enrolment of students at Batangas State University (BatStateU) leads to growth of the student database, which can be mined to discover patterns in large datasets. IISRO International Conference on Computer Networks and Information Technology, held in Bangkok, Thailand, on June 29-30, 2013, pp. 90-95