Nonlinear Regression Modeling for Engineering Applications: Modeling, Model Validation, and Enabling Design of Experiments
Rhinehart, R. Russell
Since mathematical models express our understanding of how nature behaves, we use them to validate our understanding of the fundamentals of systems (which could be processes, equipment, procedures, devices, or products). Once validated, a model is also useful for engineering applications in diagnosis, design, and optimization. First, we postulate a mechanism and then derive a model grounded in that mechanistic understanding. If the model does not fit the data, our understanding of the mechanism was wrong or incomplete, and patterns in the residuals can guide model improvement. Alternatively, when the model fits the data, our understanding is sufficient and can be applied with confidence to engineering tasks. This book details methods of nonlinear regression, computational algorithms, model validation, interpretation of residuals, and useful experimental design. The focus is on practical applications, with the relevant methods supported by fundamental analysis. The book will help the academic or industrial practitioner to properly classify the system, choose among the available modeling options and regression objectives, design experiments that capture critical system behaviors, fit the model parameters to those data, and statistically characterize the resulting model. The author has used the material in an undergraduate unit operations laboratory course and in advanced control applications.
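The postulate–fit–diagnose workflow described above can be sketched in a few lines of code. The book's own worked examples use VBA in Excel (see Appendices A–C); the following is only an illustrative stand-in in Python with NumPy/SciPy, using a hypothetical first-order model and synthetic data rather than material from the text.

```python
# Minimal sketch (assumed example, not from the book): fit a postulated
# nonlinear model by vertical least squares and inspect the residuals.
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, tau):
    # Hypothetical mechanistic form: first-order response y = a*(1 - exp(-x/tau))
    return a * (1.0 - np.exp(-x / tau))

# Synthetic stand-in for experimental data
x = np.linspace(0.0, 10.0, 25)
rng = np.random.default_rng(1)
y = model(x, 3.0, 2.5) + rng.normal(0.0, 0.05, x.size)

# Nonlinear regression of the model coefficients
coeffs, cov = curve_fit(model, x, y, p0=[1.0, 1.0])
residuals = y - model(x, *coeffs)

# Crude residual diagnostics: a nonzero mean or a strong lag-1 autocorrelation
# suggests the postulated mechanism is wrong or incomplete.
lag1 = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]
print("coefficients:", coeffs)
print("mean residual:", residuals.mean(), " lag-1 autocorrelation:", lag1)
```

The mean-residual and lag-1 autocorrelation checks here correspond loosely to the bias and r-lag-1 autocorrelation tests introduced in Chapter 4 and applied to validation and troubleshooting in Chapters 16 and 20.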
Table of Contents:

Series Preface xiii
Preface xv
Acknowledgments xxiii
Nomenclature xxv
Symbols xxxvii

Part I INTRODUCTION

1 Introductory Concepts 3
1.1 Illustrative Example: Traditional Linear Least–Squares Regression 3
1.2 How Models Are Used 7
1.3 Nonlinear Regression 7
1.4 Variable Types 8
1.5 Simulation 12
1.6 Issues 13
1.7 Takeaway 15
Exercises 15

2 Model Types 16
2.1 Model Terminology 16
2.2 A Classification of Mathematical Model Types 17
2.3 Steady–State and Dynamic Models 21
2.3.1 Steady–State Models 22
2.3.2 Dynamic Models (Time–Dependent, Transient) 24
2.4 Pseudo–First Principles: Appropriated First Principles 26
2.5 Pseudo–First Principles: Pseudo–Components 28
2.6 Empirical Models with Theoretical Grounding 28
2.6.1 Empirical Steady State 28
2.6.2 Empirical Time–Dependent 30
2.7 Empirical Models with No Theoretical Grounding 31
2.8 Partitioned Models 31
2.9 Empirical or Phenomenological? 32
2.10 Ensemble Models 32
2.11 Simulators 33
2.12 Stochastic and Probabilistic Models 33
2.13 Linearity 34
2.14 Discrete or Continuous 36
2.15 Constraints 36
2.16 Model Design (Architecture, Functionality, Structure) 37
2.17 Takeaway 37
Exercises 37

Part II PREPARATION FOR UNDERLYING SKILLS

3 Propagation of Uncertainty 43
3.1 Introduction 43
3.2 Sources of Error and Uncertainty 44
3.2.1 Estimation 45
3.2.2 Discrimination 45
3.2.3 Calibration Drift 45
3.2.4 Accuracy 45
3.2.5 Technique 46
3.2.6 Constants and Data 46
3.2.7 Noise 46
3.2.8 Model and Equations 46
3.2.9 Humans 47
3.3 Significant Digits 47
3.4 Rounding Off 48
3.5 Estimating Uncertainty on Values 49
3.5.1 Caution 50
3.6 Propagation of Uncertainty Overview: Two Types, Two Ways Each 51
3.6.1 Maximum Uncertainty 51
3.6.2 Probable Uncertainty 56
3.6.3 Generality 58
3.7 Which to Report? Maximum or Probable Uncertainty 59
3.8 Bootstrapping 59
3.9 Bias and Precision 61
3.10 Takeaway 65
Exercises 66

4 Essential Probability and Statistics 67
4.1 Variation and Its Role in Topics 67
4.2 Histogram and Its PDF and CDF Views 67
4.3 Constructing a Data–Based View of PDF and CDF 70
4.4 Parameters that Characterize the Distribution 71
4.5 Some Representative Distributions 72
4.5.1 Gaussian Distribution 72
4.5.2 Log–Normal Distribution 72
4.5.3 Logistic Distribution 74
4.5.4 Exponential Distribution 74
4.5.5 Binomial Distribution 75
4.6 Confidence Interval 76
4.7 Central Limit Theorem 77
4.8 Hypothesis and Testing 78
4.9 Type I and Type II Errors, Alpha and Beta 80
4.10 Essential Statistics for This Text 82
4.10.1 t–Test for Bias 83
4.10.2 Wilcoxon Signed Rank Test for Bias 83
4.10.3 r–lag–1 Autocorrelation Test 84
4.10.4 Runs Test 87
4.10.5 Test for Steady State in a Noisy Signal 87
4.10.6 Chi–Square Contingency Test 89
4.10.7 Kolmogorov–Smirnov Distribution Test 89
4.10.8 Test for Proportion 90
4.10.9 F–Test for Equal Variance 90
4.11 Takeaway 91
Exercises 91

5 Simulation 93
5.1 Introduction 93
5.2 Three Sources of Deviation: Measurement, Inputs, Coefficients 93
5.3 Two Types of Perturbations: Noise (Independent) and Drifts (Persistence) 95
5.4 Two Types of Influence: Additive and Scaled with Level 98
5.5 Using the Inverse CDF to Generate n and u from UID(0, 1) 99
5.6 Takeaway 100
Exercises 100

6 Steady and Transient State Detection 101
6.1 Introduction 101
6.1.1 General Applications 101
6.1.2 Concepts and Issues in Detecting Steady State 104
6.1.3 Approaches and Issues to SSID and TSID 104
6.2 Method 106
6.2.1 Conceptual Model 106
6.2.2 Equations 107
6.2.3 Coefficient, Threshold, and Sample Frequency Values 108
6.2.4 Noiseless Data 111
6.3 Applications 112
6.3.1 Applications of the R–Statistic Approach for Process Monitoring 112
6.3.2 Applications of the R–Statistic Approach for Determining Regression Convergence 112
6.4 Takeaway 114
Exercises 114

Part III REGRESSION, VALIDATION, DESIGN

7 Regression Target: Objective Function 119
7.1 Introduction 119
7.2 Experimental and Measurement Uncertainty: Static and Continuous Valued 119
7.3 Likelihood 122
7.4 Maximum Likelihood 124
7.5 Estimating x and y Values 127
7.6 Vertical SSD: A Limiting Consideration of Variability Only in the Response Measurement 127
7.7 r–Square as a Measure of Fit 128
7.8 Normal, Total, or Perpendicular SSD 130
7.9 Akaho's Method 132
7.10 Using a Model Inverse for Regression 134
7.11 Choosing the Dependent Variable 135
7.12 Model Prediction with Dynamic Models 136
7.13 Model Prediction with Classification Models 137
7.14 Model Prediction with Rank Models 138
7.15 Probabilistic Models 139
7.16 Stochastic Models 139
7.17 Takeaway 139
Exercises 140

8 Constraints 141
8.1 Introduction 141
8.2 Constraint Types 141
8.3 Expressing Hard Constraints in the Optimization Statement 142
8.4 Expressing Soft Constraints in the Optimization Statement 143
8.5 Equality Constraints 147
8.6 Takeaway 148
Exercises 148

9 The Distortion of Linearizing Transforms 149
9.1 Linearizing Coefficient Expression in Nonlinear Functions 149
9.2 The Associated Distortion 151
9.3 Sequential Coefficient Evaluation 154
9.4 Takeaway 155
Exercises 155

10 Optimization Algorithms 157
10.1 Introduction 157
10.2 Optimization Concepts 157
10.3 Gradient–Based Optimization 159
10.3.1 Numerical Derivative Evaluation 159
10.3.2 Steepest Descent: The Gradient 161
10.3.3 Cauchy's Method 162
10.3.4 Incremental Steepest Descent (ISD) 163
10.3.5 Newton–Raphson (NR) 163
10.3.6 Levenberg–Marquardt (LM) 165
10.3.7 Modified LM 166
10.3.8 Generalized Reduced Gradient (GRG) 167
10.3.9 Work Assessment 167
10.3.10 Successive Quadratic (SQ) 167
10.3.11 Perspective 168
10.4 Direct Search Optimizers 168
10.4.1 Cyclic Heuristic Direct Search 169
10.4.2 Multiplayer Direct Search Algorithms 170
10.4.3 Leapfrogging 171
10.5 Takeaway 173

11 Multiple Optima 176
11.1 Introduction 176
11.2 Quantifying the Probability of Finding the Global Best 178
11.3 Approaches to Find the Global Optimum 179
11.4 Best–of–N Rule for Regression Starts 180
11.5 Interpreting the CDF 182
11.6 Takeaway 184

12 Regression Convergence Criteria 185
12.1 Introduction 185
12.2 Convergence versus Stopping 185
12.3 Traditional Criteria for Claiming Convergence 186
12.4 Combining DV Influence on OF 188
12.5 Use Relative Impact as Convergence Criterion 189
12.6 Steady–State Convergence Criterion 190
12.7 Neural Network Validation 197
12.8 Takeaway 198
Exercises 198

13 Model Design: Desired and Undesired Model Characteristics and Effects 199
13.1 Introduction 199
13.2 Redundant Coefficients 199
13.3 Coefficient Correlation 201
13.4 Asymptotic and Uncertainty Effects When Model is Inverted 203
13.5 Irrelevant Coefficients 205
13.6 Poles and Sign Flips w.r.t. the DV 206
13.7 Too Many Adjustable Coefficients or Too Many Regressors 206
13.8 Irrelevant Model Coefficients 215
13.8.1 Standard Error of the Estimate 216
13.8.2 Backward Elimination 216
13.8.3 Logical Tests 216
13.8.4 Propagation of Uncertainty 216
13.8.5 Bootstrapping 217
13.9 Scale–Up or Scale–Down: Transition to New Phenomena 217
13.10 Takeaway 218
Exercises 218

14 Data Pre– and Post–processing 220
14.1 Introduction 220
14.2 Pre–processing Techniques 221
14.2.1 Steady– and Transient–State Selection 221
14.2.2 Internal Consistency 221
14.2.3 Truncation 222
14.2.4 Averaging and Voting 222
14.2.5 Data Reconciliation 223
14.2.6 Real–Time Noise Filtering for Noise Reduction (MA, FoF, STF) 224
14.2.7 Real–Time Noise Filtering for Outlier Removal (Median Filter) 227
14.2.8 Real–Time Noise Filtering, Statistical Process Control 228
14.2.9 Imputation of Input Data 230
14.3 Post–processing 231
14.3.1 Outliers and Rejection Criterion 231
14.3.2 Bimodal Residual Distributions 233
14.3.3 Imputation of Response Data 235
14.4 Takeaway 235
Exercises 235

15 Incremental Model Adjustment 237
15.1 Introduction 237
15.2 Choosing the Adjustable Coefficient in Phenomenological Models 238
15.3 Simple Approach 238
15.4 An Alternate Approach 240
15.5 Other Approaches 241
15.6 Takeaway 241
Exercises 241

16 Model and Experimental Validation 242
16.1 Introduction 242
16.1.1 Concepts 242
16.1.2 Deterministic Models 244
16.1.3 Stochastic Models 246
16.1.4 Reality! 249
16.2 Logic–Based Validation Criteria 250
16.3 Data–Based Validation Criteria and Statistical Tests 251
16.3.1 Continuous–Valued, Deterministic, Steady State, or End–of–Batch 251
16.3.2 Continuous–Valued, Deterministic, Transient 263
16.3.3 Class/Discrete/Rank–Valued, Deterministic, Batch, or Steady State 264
16.3.4 Continuous–Valued, Stochastic, Batch, or Steady State 265
16.3.5 Test for Normally Distributed Residuals 266
16.3.6 Experimental Procedure Validation 266
16.4 Model Discrimination 267
16.4.1 Mechanistic Models 267
16.4.2 Purely Empirical Models 268
16.5 Procedure Summary 268
16.6 Alternate Validation Approaches 269
16.7 Takeaway 270
Exercises 270

17 Model Prediction Uncertainty 272
17.1 Introduction 272
17.2 Bootstrapping 273
17.3 Takeaway 276

18 Design of Experiments for Model Development and Validation 277
18.1 Concept Plan and Data 277
18.2 Sufficiently Small Experimental Uncertainty Methodology 277
18.3 Screening Designs: A Good Plan for an Alternate Purpose 281
18.4 Experimental Design: A Plan for Validation and Discrimination 282
18.4.1 Continually Redesign 282
18.4.2 Experimental Plan 283
18.5 EHS&LP 286
18.6 Visual Examples of Undesired Designs 287
18.7 Example for an Experimental Plan 289
18.8 Takeaway 291
Exercises 292

19 Utility versus Perfection 293
19.1 Competing and Conflicting Measures of Excellence 293
19.2 Attributes for Model Utility Evaluation 294
19.3 Takeaway 295
Exercises 296

20 Troubleshooting 297
20.1 Introduction 297
20.2 Bimodal and Multimodal Residuals 297
20.3 Trends in the Residuals 298
20.4 Parameter Correlation 298
20.5 Convergence Criterion Too Tight, Too Loose 299
20.6 Overfitting (Memorization) 300
20.7 Solution Procedure Encounters Execution Errors 300
20.8 Not a Sharp CDF (OF) 300
20.9 Outliers 301
20.10 Average Residual Not Zero 302
20.11 Irrelevant Model Coefficients 302
20.12 Data Work–Up after the Trials 302
20.13 Too Many rs! 303
20.14 Propagation of Uncertainty Does Not Match Residuals 303
20.15 Multiple Optima 304
20.16 Very Slow Progress 304
20.17 All Residuals are Zero 304
20.18 Takeaway 305
Exercises 305

Part IV CASE STUDIES AND DATA

21 Case Studies 309
21.1 Valve Characterization 309
21.2 CO2 Orifice Calibration 311
21.3 Enrollment Trend 312
21.4 Algae Response to Sunlight Intensity 314
21.5 Batch Reaction Kinetics 316

Appendix A: VBA Primer: Brief on VBA Programming, Excel in Office 2013 319
Appendix B: Leapfrogging Optimizer Code for Steady–State Models 328
Appendix C: Bootstrapping with Static Model 341
References and Further Reading 350
Index 355
- ISBN: 978-1-118-59796-5
- Publisher: Wiley–Blackwell
- Binding: Hardcover
- Pages: 400
- Publication date: 30/09/2016
- Number of volumes: 1
- Language: English