ORIE 5741

Course information provided by the 2025-2026 Catalog.

Modern data sets, whether collected by scientists, engineers, medical researchers, governments, financial firms, social networks, or software companies, are often big, messy, and extremely useful. This course addresses scalable, robust methods for learning from big messy data. We'll cover techniques for learning with data that is messy (consisting of real numbers, integers, booleans, categoricals, ordinals, graphs, text, sets, and more, with missing entries and with outliers) and that is big, which means we can only use algorithms whose complexity scales linearly in the size of the data. We will cover techniques for cleaning data, supervised and unsupervised learning, finding similar items, model validation, and feature engineering.


Prerequisites: MATH 2940, ENGRD 2700, ENGRD 2110/CS 2110, CS 2800, or equivalents.

Forbidden Overlaps: CS 3780, CS 5780, ECE 3200, ECE 5420, ORIE 3741, ORIE 5741, STSCI 3740, STSCI 5740

Last 4 Terms Offered: 2025SP, 2024SP, 2023SP, 2021FA

Syllabi: none
  • Regular Academic Session. Choose one lecture and one discussion. Combined with: ORIE 3741

  • 4 Credits, Graded (no audit)

  •  5411 ORIE 5741   LEC 001

    • TR
    • Jan 20 - May 5, 2026
    • Shafiee, S

  • Instruction Mode: In Person

    Enrollment limited to: Operations Research and Information Engineering (ORIE) Master of Engineering (M.Eng.) students during pre-enroll, others may enroll during add/drop.

  •  5412 ORIE 5741   DIS 201

    • M
    • Jan 20 - May 5, 2026
    • Shafiee, S

  • Instruction Mode: In Person

  •  5413 ORIE 5741   DIS 202

    • T
    • Jan 20 - May 5, 2026
    • Shafiee, S

  • Instruction Mode: In Person

  •  5414 ORIE 5741   DIS 203

    • W
    • Jan 20 - May 5, 2026
    • Shafiee, S

  • Instruction Mode: In Person

  •  5415 ORIE 5741   DIS 204

    • W
    • Jan 20 - May 5, 2026
    • Shafiee, S

  • Instruction Mode: In Person