Abstract
We examine the management of data accuracy in inter-organizational data exchanges using the context of distributed software projects. Organizations typically manage projects by outsourcing portions of the project to partners. Managing a portfolio of such projects requires sharing data regarding the status of work-in-progress residing with the partners and estimates of these projects' completion times. Portfolio managers use these data to assign projects to be outsourced to partners. These data are rarely accurate. Unless these data are filtered, inaccuracies can lead to myopic and expensive sourcing decisions. We develop a model that uses project-status data to identify an optimal assignment of projects to be outsourced. This model permits corruption of project-status data. We use this model to compute the costs of using perfect versus inaccurate project-status data and show that the costs of deviation from optimal are sizable when the inaccuracy in the data is significant. We further propose a filter to correct inaccurate project-status data and generate an estimate of true progress. With this filter, depending on the relative magnitudes of errors, we show that accuracy of project-status data can be improved and the associated economic benefit is significant. We illustrate the improvement in accuracy and associated economic benefit by instantiating the model and the filter. We further elaborate on how the model parameters may be estimated and used in practice.
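The abstract does not say which filter the authors use, so the following is only a minimal sketch of the general idea, assuming a scalar Kalman-style filter. The function name `filter_progress` and the parameters `planned_rate`, `process_var`, and `report_var` are hypothetical and do not come from the paper. The sketch shows how the weight given to a partner's reported progress can depend on the relative magnitudes of the errors: error in how work actually advances versus error in the status report.

```python
def filter_progress(reports, planned_rate, process_var, report_var,
                    initial_estimate=0.0, initial_var=1.0):
    """Return filtered progress estimates, one per status report.

    Assumed model (illustrative, not the paper's):
      true progress advances by planned_rate each period, plus error
      with variance process_var; each report equals true progress plus
      error with variance report_var.
    """
    estimate, variance = initial_estimate, initial_var
    estimates = []
    for reported in reports:
        # Predict: advance by the planned rate; uncertainty grows.
        estimate += planned_rate
        variance += process_var
        # Update: the gain weighs the report against the prediction
        # according to the relative sizes of the two error variances.
        gain = variance / (variance + report_var)
        estimate += gain * (reported - estimate)
        variance *= 1.0 - gain
        estimates.append(estimate)
    return estimates


if __name__ == "__main__":
    # Hypothetical status reports (percent complete) from one partner.
    reports = [12.0, 19.0, 33.0, 38.0]
    print(filter_progress(reports, planned_rate=10.0,
                          process_var=1.0, report_var=4.0))
```

When `report_var` is small relative to `process_var`, the gain approaches 1 and the estimate follows the reports. When `report_var` is large, the gain approaches 0 and the estimate stays close to the plan.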
Supplemental Material
Supplemental movie, appendix, image, and software files are available for download for "Accuracy of aggregate data in distributed project settings: Model, analysis and implications."