Learning Policies and Heuristics for Bidirectional Search

dc.contributor.advisorLelis, Levi (Computing Science)
dc.contributor.authorTjhia, Kenneth G
dc.date.accessioned2025-05-28T23:06:09Z
dc.date.available2025-05-28T23:06:09Z
dc.date.issued2024-11
dc.description.abstractThis thesis empirically investigates the comparative ease of learning policies and heuristics for bidirectional versus unidirectional search in satisficing classical planning. Our research explores the potential advantages of bidirectional search in terms of learnability and efficiency of the resulting guidance mechanisms. We employ a learning framework where neural networks parameterize policies and heuristics, trained on solutions generated by bidirectional and unidirectional search methods across three classical planning domains: Sliding Tile Puzzle, Witness Puzzles (Triangles and Colours), and Pancake Puzzle. We compare base algorithms that span a spectrum from pure heuristic-based to pure policy-based approaches. As models improve, they solve more problems and find different solutions, potentially enhancing the quality of training data. Experimental results suggest that learning satisficing policies and heuristics for bidirectional search is often easier and leads to more efficient search guidance. Consistently, during training, bidirectional searches achieved greater or equal solve rates after seeing fewer problems and using fewer cumulative expansions. The resulting learned bidirectional guidance mechanisms were efficient in terms of expansions and generalized well to test sets. These findings were particularly pronounced in the Witness and Pancake domains. We propose several possible explanations for these observations, including the potential for bidirectional search to yield different solution distributions, allow for complementary interaction between the two directions predictions, and reduce the effective search depth. This research contributes to the ongoing exploration of learning in planning, raising new questions about the relationship between search strategies and the ease of learning effective guidance.
dc.identifier.doihttps://doi.org/10.7939/r3-3hft-x996
dc.language.isoen
dc.rightsThis thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
dc.subjectsearch
dc.subjectclassical planning
dc.subjectplanning
dc.subjectbidirectional search
dc.subjectlearning
dc.subjectheuristic search
dc.titleLearning Policies and Heuristics for Bidirectional Search
dc.typehttp://purl.org/coar/resource_type/c_46ec
thesis.degree.disciplineStatistical Machine Learning
thesis.degree.grantorhttp://id.loc.gov/authorities/names/n79058482
thesis.degree.levelMaster's
thesis.degree.nameMaster of Science
ual.date.graduationFall 2024
ual.departmentDepartment of Computing Science
ual.jupiterAccesshttp://terms.library.ualberta.ca/public

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Tjhia_Kenneth_G_202409_MSc.pdf
Size:
13.4 MB
Format:
Adobe Portable Document Format