MultiAgentDecisionProcess
GeneralizedMAAStarPlanner is the abstract base class for Generalized MAA* (GMAA) planners.
#include <GeneralizedMAAStarPlanner.h>
Public Member Functions | |
GeneralizedMAAStarPlanner (int verbose_level=0, double slack=0.0) | |
(default) Constructor | |
double | GetExpectedReward () const |
boost::shared_ptr< JointPolicy > | GetJointPolicy () |
boost::shared_ptr < JointPolicyDiscrete > | GetJointPolicyDiscrete () |
boost::shared_ptr < JointPolicyDiscretePure > | GetJointPolicyDiscretePure () |
LIndex | GetMaxJPolPoolSize () const |
double | GetMaxLowerBound () const |
LIndex | GetNrEvaluatedJPolBGs () const |
GeneralizedMAAStarPlanner & | operator= (const GeneralizedMAAStarPlanner &o) |
Copy assignment operator. | |
void | Plan () |
void | SetDeadline (size_t deadlineInS) |
void | SetIntermediateResultFile (std::ofstream &of) |
void | SetIntermediateTimingFilename (const std::string &filename) |
void | SetSaveAllBGs (const std::string &filename) |
void | SetVerbose (int verbose) |
~GeneralizedMAAStarPlanner () | |
Destructor. | |
Public Member Functions inherited from TimedAlgorithm | |
void | AddTimedEvent (const std::string &id, clock_t duration) |
Adds an event of a given duration, e.g., an external program call. | |
std::vector< double > | GetTimedEventDurations (const std::string &id) |
Returns all stored durations (in s) for a particular event. | |
void | LoadTimers (const std::string &filename) |
Load timing info from file filename. | |
void | PrintTimers () const |
Print stored timing info. | |
void | PrintTimersSummary () const |
Sums data and prints out a summary. | |
void | SaveTimers (const std::string &filename) const |
Save collected timing info to file filename. | |
void | SaveTimers (std::ofstream &of) const |
Save collected timing info to ofstream of. | |
void | StartTimer (const std::string &id) const |
Start timing an event identified by id. | |
void | StopTimer (const std::string &id) const |
Stop timing an event identified by id. | |
TimedAlgorithm () | |
(default) Constructor | |
virtual | ~TimedAlgorithm () |
Destructor. | |
Protected Member Functions | |
virtual bool | ConstructAndValuateNextPolicies (const boost::shared_ptr< PartialPolicyPoolItemInterface > &ppi, const boost::shared_ptr< PartialPolicyPoolInterface > &poolOfNextPolicies, bool &cleanUpPPI)=0 |
The 'NEXT' function as described in refGMAA. | |
virtual boost::shared_ptr < PartialJointPolicyDiscretePure > | ConstructExtendedJointPolicy (const PartialJointPolicyDiscretePure &jpolPrevTs, const JointPolicyDiscretePure &jpolBG, const std::vector< size_t > &nrOHts, const std::vector< Index > &firstOHtsI)=0 |
Extends a previous policy jpolPrevTs to the next stage. | |
virtual Interface_ProblemToPolicyDiscretePure * | GetThisFromMostDerivedPU ()=0 |
Must be implemented by every derived class to return this, giving the base class access to the Interface_ProblemToPolicyDiscretePure. | |
virtual boost::shared_ptr < PartialJointPolicyDiscretePure > | NewJPol () const =0 |
Returns a new policy. | |
virtual boost::shared_ptr < PartialPolicyPoolInterface > | NewPP () const =0 |
Returns a new policy pool. | |
virtual boost::shared_ptr < PartialPolicyPoolItemInterface > | NewPPI (const boost::shared_ptr< PartialJointPolicyDiscretePure > &p, double v) const =0 |
Returns a new policy pool item. | |
void | Prune (PartialPolicyPoolInterface &JPVs, size_t k) |
virtual void | ResetPlanner ()=0 |
Resets the planner so that it can be started from the beginning. | |
void | SelectKBestPoliciesToProcessFurther (const boost::shared_ptr< PartialPolicyPoolInterface > &poolOfNextPolicies, bool are_LBs, double bestLB, size_t k) |
Returns the k best-ranked (partial) joint policies. | |
virtual void | SelectPoliciesToProcessFurther (const boost::shared_ptr< PartialPolicyPoolInterface > &poolOfNextPolicies, bool are_LBs, double bestLB)=0 |
Limits the policies to be further examined. | |
void | SetCBGbounds (const boost::shared_ptr< PartialPolicyPoolItemInterface > &ppi, const boost::shared_ptr< BayesianGameIdenticalPayoffSolver > &bgips, bool is_last_ts, double discount) |
Protected Attributes | |
std::string | _m_bgBaseFilename |
size_t | _m_bgCounter |
std::vector< size_t > | _m_expanded_childs |
_m_expanded_childs[t] contains the number of child nodes that were expanded 'at stage t'. | |
std::ofstream * | _m_intermediateResultFile |
Pointer to a file stream that stores the intermediate (timing) results. | |
std::string | _m_intermediateTimingFilename |
std::vector< LIndex > | _m_max_expanded_childs |
LIndex | _m_nrJPolBGsEvaluated |
size_t | _m_nrPoliciesToProcess |
std::vector < BayesianGameForDecPOMDPStage * > | _m_pointersToAllBGTS |
bool | _m_saveIntermediateTiming |
double | _m_slack |
When the heuristic is not admissible, or the past reward is an approximation, we may add some slack so that good policies are not pruned. | |
bool | _m_useSparseBeliefs |
int | _m_verboseness |
The level of verboseness: default=0, >0 verbose, <0 silent. | |
Private Member Functions | |
void | Initialize () |
Initialize the planner. | |
Private Attributes | |
size_t | _m_deadline |
double | _m_expectedRewardFoundPolicy |
The expected reward of the best found policy. | |
boost::shared_ptr < JointPolicyDiscretePure > | _m_foundPolicy |
The best found policy. | |
LIndex | _m_maxJPolPoolSize |
A counter that maintains the maximum size of the policy pool during the planning process. | |
double | _m_maxLowerBound |
The highest lower bound found so far. | |
GeneralizedMAAStarPlanner is the abstract base class for Generalized MAA* (GMAA) planners.
It implements GMAA largely as described in refGMAA (see DOC-References.h). What refGMAA calls 'NEXT' corresponds here to ConstructAndValuateNextPolicies.
In addition, there is a function SelectPoliciesToProcessFurther (not to be confused with the 'SELECT' function from refGMAA). Given the result of ConstructAndValuateNextPolicies, SelectPoliciesToProcessFurther determines which of the new policies are actually added to the policy pool. That is, ConstructAndValuateNextPolicies and SelectPoliciesToProcessFurther together form 'NEXT'.
The 'SELECT' function described in refGMAA is implemented by the policy pool itself (see PartialPolicyPoolInterface).
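The interplay of 'SELECT' (the pool) and 'NEXT' (expansion plus filtering) described above can be sketched as a best-first loop. This is a minimal illustration under stated assumptions, not the code of Plan(): Node, Pool, NextFn and RunSearchSketch are hypothetical names, the pool is modelled with std::priority_queue, and the way slack enters the pruning test is an assumption based on the description of _m_slack.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <vector>

// Hypothetical stand-in for a <jpol,val> pair: only depth and value are kept.
struct Node {
    int depth;     // number of stages covered by the (partial) joint policy
    double value;  // heuristic value; treated as exact once depth == horizon
};

// Orders the pool so that top() is the node with the highest value.
struct LowerValueLast {
    bool operator()(const Node& a, const Node& b) const { return a.value < b.value; }
};

// The pool plays the role of 'SELECT': top() picks the next node to expand.
using Pool = std::priority_queue<Node, std::vector<Node>, LowerValueLast>;

// Plays the role of 'NEXT': returns the children that should enter the pool.
using NextFn = std::function<std::vector<Node>(const Node&)>;

// Returns the value of the best full-length policy found.
double RunSearchSketch(const Node& root, int horizon, double slack, const NextFn& next) {
    assert(horizon > 0 && slack >= 0.0);
    Pool pool;
    pool.push(root);
    double bestLowerBound = std::numeric_limits<double>::lowest();
    while (!pool.empty()) {
        const Node parent = pool.top();  // 'SELECT'
        pool.pop();
        // Assumed pruning rule: stop once even the best remaining node,
        // with slack added, cannot beat the best full policy found so far.
        if (parent.value + slack < bestLowerBound) break;
        for (const Node& child : next(parent)) {  // 'NEXT'
            if (child.depth >= horizon) {
                bestLowerBound = std::max(bestLowerBound, child.value);
            } else if (child.value + slack >= bestLowerBound) {
                pool.push(child);
            }
        }
    }
    return bestLowerBound;
}
```

In this sketch, different planner variants would differ only in the NextFn they supply, which mirrors how derived classes override ConstructAndValuateNextPolicies and SelectPoliciesToProcessFurther.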
GeneralizedMAAStarPlanner::GeneralizedMAAStarPlanner | ( | int | verbose_level = 0, | double | slack = 0.0 | ) |
(default) Constructor
References _m_bgBaseFilename, _m_bgCounter, _m_intermediateResultFile, _m_nrJPolBGsEvaluated, _m_nrPoliciesToProcess, and _m_saveIntermediateTiming.
GeneralizedMAAStarPlanner::~GeneralizedMAAStarPlanner | ( | ) |
Destructor.
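virtual bool GeneralizedMAAStarPlanner::ConstructAndValuateNextPolicies | ( | const boost::shared_ptr< PartialPolicyPoolItemInterface > & | ppi, | const boost::shared_ptr< PartialPolicyPoolInterface > & | poolOfNextPolicies, | bool & | cleanUpPPI | ) |
protected pure virtual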
The 'NEXT' function as described in refGMAA.
Given a <jpol,val> pair, this function constructs a new set of joint policies, ordered by value in a priority queue. Override it in derived classes to obtain different planning behavior.
Implemented in GMAA_kGMAACluster, GMAA_MAAstarCluster, GMAA_MAAstar, GMAA_MAA_ELSI, GMAA_kGMAA, and GMAA_MAAstarClassic.
Referenced by Plan().
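virtual boost::shared_ptr< PartialJointPolicyDiscretePure > GeneralizedMAAStarPlanner::ConstructExtendedJointPolicy | ( | const PartialJointPolicyDiscretePure & | jpolPrevTs, | const JointPolicyDiscretePure & | jpolBG, | const std::vector< size_t > & | nrOHts, | const std::vector< Index > & | firstOHtsI | ) |
protected pure virtual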
Extends a previous policy jpolPrevTs to the next stage.
This function extends a previous policy jpolPrevTs for ts-1 with the behavior specified by the policy of the BayesianGame for time step ts (jpolBG).
jpolPrevTs - a joint policy for the DecPOMDP up to time step ts-1 (i.e., with depth=ts-2).
jpolBG - a joint policy for the BayesianGame for time step ts.
nrOHts - a vector that specifies the number of observation histories for each agent at time step ts.
firstOHtsI - a vector that specifies the index of the first observation history in time step ts for each agent (this serves as the offset in the BG->DecPOMDP index conversion).
Returns a boost::shared_ptr to the newly created PartialJointPolicyDiscretePure; ownership is shared, so no explicit delete is needed.
Implemented in GeneralizedMAAStarPlannerForFactoredDecPOMDPDiscrete, and GeneralizedMAAStarPlannerForDecPOMDPDiscrete.
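The role of nrOHts and firstOHtsI as sizes and offsets can be illustrated with a small sketch. It assumes a hypothetical flat representation in which each agent's policy is a vector of action indices indexed by observation-history index; the actual PartialJointPolicyDiscretePure classes are not used, and ExtendJointPolicySketch is an illustrative name only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat representation: one action index per observation history.
using IndividualPolicy = std::vector<int>;
using JointPolicy = std::vector<IndividualPolicy>;

// Copies the previous policy and writes the BG policy of stage ts behind it.
// firstOHtsI[agent] is the offset that converts a BG type index into the
// index of the corresponding observation history in the extended policy.
JointPolicy ExtendJointPolicySketch(const JointPolicy& jpolPrevTs,
                                    const JointPolicy& jpolBG,
                                    const std::vector<std::size_t>& nrOHts,
                                    const std::vector<std::size_t>& firstOHtsI) {
    const std::size_t nrAgents = jpolPrevTs.size();
    assert(jpolBG.size() == nrAgents);
    assert(nrOHts.size() == nrAgents && firstOHtsI.size() == nrAgents);

    JointPolicy extended = jpolPrevTs;
    for (std::size_t agent = 0; agent < nrAgents; ++agent) {
        assert(jpolBG[agent].size() == nrOHts[agent]);
        const std::size_t needed = firstOHtsI[agent] + nrOHts[agent];
        if (extended[agent].size() < needed) {
            extended[agent].resize(needed, -1);  // -1 marks "not yet specified"
        }
        for (std::size_t type = 0; type < nrOHts[agent]; ++type) {
            extended[agent][firstOHtsI[agent] + type] = jpolBG[agent][type];
        }
    }
    return extended;
}
```

For two agents that each have one observation history at the first stage and two at the second, the offsets would both be 1, so the BG actions land at indices 1 and 2 of each individual policy.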
double GeneralizedMAAStarPlanner::GetExpectedReward | ( | ) | const
inline
boost::shared_ptr< JointPolicy > GeneralizedMAAStarPlanner::GetJointPolicy | ( | void | ) |
boost::shared_ptr< JointPolicyDiscrete > GeneralizedMAAStarPlanner::GetJointPolicyDiscrete | ( | void | ) |
References _m_foundPolicy.
JPDP_sharedPtr GeneralizedMAAStarPlanner::GetJointPolicyDiscretePure | ( | ) |
References _m_foundPolicy.
LIndex GeneralizedMAAStarPlanner::GetMaxJPolPoolSize | ( | ) | const
inline
double GeneralizedMAAStarPlanner::GetMaxLowerBound | ( | ) | const
inline
Referenced by SetCBGbounds().
LIndex GeneralizedMAAStarPlanner::GetNrEvaluatedJPolBGs | ( | ) | const
inline
References _m_nrJPolBGsEvaluated.
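virtual Interface_ProblemToPolicyDiscretePure * GeneralizedMAAStarPlanner::GetThisFromMostDerivedPU | ( | ) |
protected pure virtual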
Every derived class must implement this function as follows:
GetThisFromMostDerivedPU() { return this; }
This gives the base class access to the Interface_ProblemToPolicyDiscretePure.
Implemented in GeneralizedMAAStarPlannerForFactoredDecPOMDPDiscrete, GeneralizedMAAStarPlannerForDecPOMDPDiscrete, GMAA_kGMAACluster, and GMAA_MAAstarCluster.
Referenced by Plan().
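The "return this" idiom can be sketched as follows. The sketch assumes that the base planner does not itself derive from the interface while the most derived class does; ProblemInterface, PlannerBase, ConcretePlanner, GetNrAgents and NrAgentsSeenByBase are hypothetical stand-ins, not names from the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Interface_ProblemToPolicyDiscretePure.
struct ProblemInterface {
    virtual ~ProblemInterface() = default;
    virtual std::size_t GetNrAgents() const = 0;
};

// Stand-in for the planner base class. It is not a ProblemInterface itself,
// so it asks the most derived object for a pointer to one.
class PlannerBase {
public:
    virtual ~PlannerBase() = default;
    std::size_t NrAgentsSeenByBase() {
        return GetThisFromMostDerivedPU()->GetNrAgents();
    }

protected:
    virtual ProblemInterface* GetThisFromMostDerivedPU() = 0;
};

// The most derived class inherits from both, so 'this' converts implicitly
// to the interface pointer that the base class needs.
class ConcretePlanner : public PlannerBase, public ProblemInterface {
public:
    std::size_t GetNrAgents() const override { return 2; }

protected:
    ProblemInterface* GetThisFromMostDerivedPU() override { return this; }
};
```

The virtual call avoids a cross-cast in the base class: the conversion from the derived type to the interface happens where both types are known.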
void GeneralizedMAAStarPlanner::Initialize | ( | ) |
private
Initialize the planner.
Calls the ResetPlanner() of the derived planner.
References _m_bgCounter, _m_expanded_childs, _m_expectedRewardFoundPolicy, _m_foundPolicy, _m_max_expanded_childs, _m_maxJPolPoolSize, _m_maxLowerBound, _m_pointersToAllBGTS, and ResetPlanner().
Referenced by Plan().
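virtual boost::shared_ptr< PartialJointPolicyDiscretePure > GeneralizedMAAStarPlanner::NewJPol | ( | ) | const
protected pure virtual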
Returns a new policy.
Different versions of GMAA may use different policy implementations. This function must be implemented by a derived class and return a pointer to a newly created PartialJointPolicyDiscretePure object. This way, the derived class determines the implementation of the policy.
The object is returned through a boost::shared_ptr, so the caller does not need to delete it explicitly.
Implemented in GMAA_kGMAACluster, GMAA_MAAstarCluster, GeneralizedMAAStarPlannerForFactoredDecPOMDPDiscrete, and GeneralizedMAAStarPlannerForDecPOMDPDiscrete.
Referenced by Plan().
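virtual boost::shared_ptr< PartialPolicyPoolInterface > GeneralizedMAAStarPlanner::NewPP | ( | ) | const
protected pure virtual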
Returns a new policy pool.
This function must be implemented by a derived class and return a pointer to a PartialPolicyPoolInterface object. This way, the derived class determines the implementation of the policy pools used and thus the 'SELECT' function as described in refGMAA.
The pool is returned through a boost::shared_ptr, so the caller does not need to delete it explicitly.
Implemented in GeneralizedMAAStarPlannerForFactoredDecPOMDPDiscrete, and GeneralizedMAAStarPlannerForDecPOMDPDiscrete.
Referenced by Plan(), Prune(), and SelectKBestPoliciesToProcessFurther().
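virtual boost::shared_ptr< PartialPolicyPoolItemInterface > GeneralizedMAAStarPlanner::NewPPI | ( | const boost::shared_ptr< PartialJointPolicyDiscretePure > & | p, | double | v | ) | const
protected pure virtual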
Returns a new policy pool item.
This function must be implemented by a derived class and return a pointer to a newly created PartialPolicyPoolItemInterface object. This way, the derived class determines the implementation of the policy pool items used and thus the 'SELECT' function as described in refGMAA.
The item is returned through a boost::shared_ptr, so the caller does not need to delete it explicitly. Overloaded form of NewPP().
Creates a PartialPolicyPoolItemInterface that contains joint policy p with value v.
Implemented in GMAA_kGMAACluster, GMAA_MAAstarCluster, GMAA_MAAstar, GMAA_kGMAA, GMAA_MAAstarClassic, GeneralizedMAAStarPlannerForFactoredDecPOMDPDiscrete, and GeneralizedMAAStarPlannerForDecPOMDPDiscrete.
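NewJPol(), NewPP() and NewPPI() all follow the same factory-method pattern: the base class asks for an object through a pure virtual function and the derived class picks the concrete type. A minimal sketch of that pattern is given below. It uses std::shared_ptr in place of boost::shared_ptr to stay self-contained, and PoolItem, SimpleItem, PlannerBase, SimplePlanner and ValueOfFreshItem are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for PartialPolicyPoolItemInterface.
struct PoolItem {
    virtual ~PoolItem() = default;
    virtual double GetValue() const = 0;
};

// One possible concrete item type, chosen by a derived planner.
struct SimpleItem : PoolItem {
    explicit SimpleItem(double v) : value(v) {}
    double GetValue() const override { return value; }
    double value;
};

// The base class only knows the interface and delegates creation.
class PlannerBase {
public:
    virtual ~PlannerBase() = default;
    double ValueOfFreshItem(double v) const { return NewPPI(v)->GetValue(); }

protected:
    virtual std::shared_ptr<PoolItem> NewPPI(double v) const = 0;
};

// The derived class decides which implementation is instantiated.
class SimplePlanner : public PlannerBase {
protected:
    std::shared_ptr<PoolItem> NewPPI(double v) const override {
        return std::make_shared<SimpleItem>(v);
    }
};
```

Because the factory returns a shared pointer, the item is destroyed automatically when the last owner releases it.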
GeneralizedMAAStarPlanner & GeneralizedMAAStarPlanner::operator= | ( | const GeneralizedMAAStarPlanner & | o | ) |
Copy assignment operator.
void GeneralizedMAAStarPlanner::Plan | ( | ) |
References _m_deadline, _m_expanded_childs, _m_expectedRewardFoundPolicy, _m_foundPolicy, _m_intermediateResultFile, _m_intermediateTimingFilename, _m_max_expanded_childs, _m_maxJPolPoolSize, _m_maxLowerBound, _m_saveIntermediateTiming, _m_slack, _m_verboseness, ConstructAndValuateNextPolicies(), DEBUG_GMAA4, GetThisFromMostDerivedPU(), Initialize(), NewJPol(), NewPP(), TimedAlgorithm::SaveTimers(), SelectPoliciesToProcessFurther(), PrintTools::SoftPrintVector(), TimedAlgorithm::StartTimer(), and TimedAlgorithm::StopTimer().
Referenced by GeneralizedMAAStarPlannerForDecPOMDPDiscrete::Plan(), and GeneralizedMAAStarPlannerForFactoredDecPOMDPDiscrete::Plan().
void GeneralizedMAAStarPlanner::Prune | ( | PartialPolicyPoolInterface & | JPVs, | size_t | k | ) |
protected
virtual void GeneralizedMAAStarPlanner::ResetPlanner | ( | ) |
protected pure virtual
This should reset the planner, so it can be started from the beginning.
Implemented in GMAA_kGMAACluster, GMAA_MAAstarCluster, GMAA_MAAstar, GMAA_kGMAA, GMAA_MAAstarClassic, and GMAA_MAA_ELSI.
Referenced by Initialize().
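void GeneralizedMAAStarPlanner::SelectKBestPoliciesToProcessFurther | ( | const boost::shared_ptr< PartialPolicyPoolInterface > & | poolOfNextPolicies, | bool | are_LBs, | double | bestLB, | size_t | k | ) |
protected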
Returns the k best-ranked (partial) joint policies.
An implementation of the 'SELECT' function that returns the k (partial) joint policies with the highest heuristic values.
References NewPP().
Referenced by GMAA_kGMAA::SelectPoliciesToProcessFurther(), and GMAA_kGMAACluster::SelectPoliciesToProcessFurther().
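Selecting the k highest-valued candidates can be sketched with the standard library. The sketch below works on plain heuristic values rather than on a PartialPolicyPoolInterface, and SelectKBestSketch is a hypothetical name; it only illustrates the selection step, not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Returns the k highest values, best first. If fewer than k values are
// available, all of them are returned.
std::vector<double> SelectKBestSketch(std::vector<double> values, std::size_t k) {
    k = std::min(k, values.size());
    // Only the first k positions need to be sorted in descending order.
    std::partial_sort(values.begin(),
                      values.begin() + static_cast<std::ptrdiff_t>(k),
                      values.end(),
                      std::greater<double>());
    values.resize(k);
    return values;
}
```

std::partial_sort is used instead of a full sort because only the top k entries are needed.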
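virtual void GeneralizedMAAStarPlanner::SelectPoliciesToProcessFurther | ( | const boost::shared_ptr< PartialPolicyPoolInterface > & | poolOfNextPolicies, | bool | are_LBs, | double | bestLB | ) |
protected pure virtual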
Limits the policies to be further examined.
We may not want to process all of the <jpol,val> pairs found by ConstructAndValuateNextPolicies any further, so this function selects among them. Override it in derived classes to obtain different planning behavior.
Implemented in GMAA_kGMAACluster, GMAA_MAAstarCluster, GMAA_MAAstar, GMAA_kGMAA, GMAA_MAAstarClassic, and GMAA_MAA_ELSI.
Referenced by Plan().
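void GeneralizedMAAStarPlanner::SetCBGbounds | ( | const boost::shared_ptr< PartialPolicyPoolItemInterface > & | ppi, | const boost::shared_ptr< BayesianGameIdenticalPayoffSolver > & | bgips, | bool | is_last_ts, | double | discount | ) |
inline protected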
void GeneralizedMAAStarPlanner::SetDeadline | ( | size_t | deadlineInS | ) |
References _m_deadline, and _m_verboseness.
void GeneralizedMAAStarPlanner::SetIntermediateResultFile | ( | std::ofstream & | of | ) |
inline
void GeneralizedMAAStarPlanner::SetIntermediateTimingFilename | ( | const std::string & | filename | ) |
References _m_intermediateTimingFilename, and _m_saveIntermediateTiming.
void GeneralizedMAAStarPlanner::SetSaveAllBGs | ( | const std::string & | filename | ) |
inline
void GeneralizedMAAStarPlanner::SetVerbose | ( | int | verbose | ) |
References _m_verboseness.
std::string GeneralizedMAAStarPlanner::_m_bgBaseFilename |
protected
Referenced by GMAA_MAA_ELSI::CAVNP_quick_n_dirty2(), GMAA_MAAstarClassic::ConstructAndValuateNextPolicies(), GMAA_kGMAA::ConstructAndValuateNextPolicies(), GMAA_MAAstar::ConstructAndValuateNextPolicies(), GMAA_MAAstarCluster::ConstructAndValuateNextPolicies(), GMAA_MAA_ELSI::ConstructBayesianGame(), and GeneralizedMAAStarPlanner().
size_t GeneralizedMAAStarPlanner::_m_bgCounter |
protected
Referenced by GMAA_MAA_ELSI::CAVNP_quick_n_dirty2(), GMAA_MAAstarClassic::ConstructAndValuateNextPolicies(), GMAA_kGMAA::ConstructAndValuateNextPolicies(), GMAA_MAAstar::ConstructAndValuateNextPolicies(), GMAA_MAAstarCluster::ConstructAndValuateNextPolicies(), GMAA_MAA_ELSI::ConstructBayesianGame(), GeneralizedMAAStarPlanner(), and Initialize().
size_t GeneralizedMAAStarPlanner::_m_deadline |
private
Referenced by Plan(), and SetDeadline().
std::vector< size_t > GeneralizedMAAStarPlanner::_m_expanded_childs |
protected
_m_expanded_childs[t] contains the number of child nodes that were expanded 'at stage t'.
That is, if the selected parent of depth $t$ is $\varphi^t$, then each child $\varphi^{t+1} = (\varphi^t, \beta^t)$ has depth $t+1$ (so the counted child nodes are at depth $t+1$).
Referenced by Initialize(), and Plan().
double GeneralizedMAAStarPlanner::_m_expectedRewardFoundPolicy |
private
the expected reward of the best found policy
Referenced by Initialize(), and Plan().
boost::shared_ptr< JointPolicyDiscretePure > GeneralizedMAAStarPlanner::_m_foundPolicy |
private
the best found policy
Referenced by GetJointPolicy(), GetJointPolicyDiscrete(), GetJointPolicyDiscretePure(), Initialize(), and Plan().
std::ofstream * GeneralizedMAAStarPlanner::_m_intermediateResultFile |
protected
Pointer to a file stream that stores the intermediate (timing) results.
Referenced by GeneralizedMAAStarPlanner(), and Plan().
std::string GeneralizedMAAStarPlanner::_m_intermediateTimingFilename |
protected
Referenced by Plan(), and SetIntermediateTimingFilename().
std::vector< LIndex > GeneralizedMAAStarPlanner::_m_max_expanded_childs |
protected
Referenced by Initialize(), and Plan().
LIndex GeneralizedMAAStarPlanner::_m_maxJPolPoolSize |
private
a counter that maintains the maximum size of the policy pool during the planning process.
Referenced by Initialize(), and Plan().
double GeneralizedMAAStarPlanner::_m_maxLowerBound |
private
The highest lowerbound found so far.
Referenced by Initialize(), and Plan().
LIndex GeneralizedMAAStarPlanner::_m_nrJPolBGsEvaluated |
protected
size_t GeneralizedMAAStarPlanner::_m_nrPoliciesToProcess |
protected
Referenced by GMAA_kGMAA::ConstructAndValuateNextPolicies(), GMAA_kGMAACluster::ConstructAndValuateNextPolicies(), GeneralizedMAAStarPlanner(), GMAA_kGMAA::GMAA_kGMAA(), GMAA_kGMAACluster::GMAA_kGMAACluster(), GMAA_kGMAA::SelectPoliciesToProcessFurther(), and GMAA_kGMAACluster::SelectPoliciesToProcessFurther().
std::vector< BayesianGameForDecPOMDPStage * > GeneralizedMAAStarPlanner::_m_pointersToAllBGTS |
protected
Referenced by Initialize().
bool GeneralizedMAAStarPlanner::_m_saveIntermediateTiming |
protected
Referenced by GeneralizedMAAStarPlanner(), Plan(), and SetIntermediateTimingFilename().
double GeneralizedMAAStarPlanner::_m_slack |
protected
when the heuristic is not admissible, or the past reward is an approximation, we may add some slack such that good policies are not pruned
Referenced by Plan().
bool GeneralizedMAAStarPlanner::_m_useSparseBeliefs |
protected
int GeneralizedMAAStarPlanner::_m_verboseness |
protected
the level of verboseness, default=0, >0 verbose, <0 silent
Referenced by GMAA_MAA_ELSI::CAVNP_quick_n_dirty2(), GMAA_MAAstarClassic::ConstructAndValuateNextPolicies(), GMAA_MAAstar::ConstructAndValuateNextPolicies(), GMAA_MAAstarCluster::ConstructAndValuateNextPolicies(), GMAA_kGMAACluster::ConstructAndValuateNextPolicies(), GMAA_MAA_ELSI::ConstructAndValuateNextPoliciesExactBG(), GMAA_MAA_ELSI::ConstructBayesianGame(), Plan(), SetDeadline(), and SetVerbose().