Sajama

 

Email: sajama@gmail.com

Phone: 858-472-2371

 

About me:

I am interested in the design, analysis and practical application of algorithms that model observations and make predictions. I am a Scientist at Fair Isaac in San Diego, where I have worked on designing and prototyping algorithms for time-varying classification of time series data and fuzzy entity matching. Prior to joining Fair Isaac, I was a PhD student in Prof. Alon Orlitsky's group in the Electrical and Computer Engineering department at UCSD, where I worked in the areas of dimensionality reduction of heterogeneous data, supervised feature transformation, semi-supervised learning and density estimation with sparse data. I also have a Master of Science degree from the Wireless Networks Lab at Cornell University and a Bachelor of Technology from the Indian Institute of Technology, Bombay.

 

 



Research:

Estimating and computing density based distance metrics --- Density based distance metrics have been proposed for semi-supervised learning, non-linear interpolation and clustering. In this paper we bound the rate of convergence of non-parametric estimates of density based distance metrics. We further give an asymptotically consistent graph based method for computing these metrics and bound the rate at which its approximation error goes to zero. [ ICML 2005 paper ] [ Semisupervised learning book chapter - Abstract  pdf ]

 

Supervised dimensionality reduction using mixture models --- A supervised, linear dimensionality reduction method based on maximum mutual information estimation of low dimensional exponential family mixture models. This is an adaptation to the supervised case of the technique used in the unsupervised scheme 'Semi-parametric exponential family PCA'. [ ICML 2005 paper - pdf ] [ Technical Report No. CS2004-0810, December 2004, UCSD - Abstract  ps  pdf ]

 

Semi-parametric exponential family PCA --- A linear, probabilistic dimensionality reduction scheme based on latent variable modeling with non-parametric latent distribution estimation. [ NIPS 2004 paper - Abstract  pdf  ps  bibtex ] [ Technical Report No. CS2004-0790, June 2004, UCSD ]

 

Practical algorithms for modeling sparse data ---  A method for estimating the distribution underlying sparse data. Instead of maximum likelihood, we consider a different estimate that maximizes the probability of the number of symbols appearing any given number of times. [ Proceedings of IEEE Symposium on Information Theory, 2004 paper - ps ] [ Proceedings of Allerton Conference, 2003 - ps ]

 

Independent tree ad hoc multicast routing --- A technique for using alternate trees to improve the time between failures of a multicast service. [ Mobile Networks and Applications, Volume 8, Issue 5, October 2003 - pdf ] [ Vehicular Technology Conference, 2001 - pdf ]

 

Z. J. Haas, J. Deng, B. Liang, P. Papadimitratos, and S. Sajama, “Wireless Ad Hoc Networks,” in Wiley Encyclopedia of Telecommunications, John G. Proakis, Editor, John Wiley & Sons, New York, 2002. [ pdf ]