
(Master's thesis) APPROCHES COLLECTIVES POUR LE PROBLÈME DE LA PATROUILLE MULTI-AGENTS


DOCUMENT INFORMATION

Basic information

Title: Approches Collectives Pour Le Problème De La Patrouille Multi-Agents
Author: Chu Hoang Nam
Supervisors: M. Olivier Simonin – Maître de Conférences, M. François Charpillet – Directeur de Recherche
Institution: Institut de la Francophonie pour l'Informatique
Specialty: Intelligence Artificielle et Multimédia
Document type: mémoire (master's thesis)
Year: 2007
City: Vandœuvre-lès-Nancy
Pages: 47
File size: 1.04 MB

Structure

  • 1.1. Evaluation criteria
  • 1.2. Environment
  • 1.3. Previous work
  • 2.1. Collective intelligence (Swarm Intelligence)
  • 2.2. Digital pheromone
  • 2.3. EVAP: a model based on pheromone evaporation
  • 2.4. CLInG: a model based on information propagation
  • 3.1. Simulation and analysis
  • 3.2. Discussion
    • 3.2.1. Complexity
    • 3.2.2. Exploration and patrolling
    • 3.2.3. Advantages and drawbacks of the methods
  • 4.1. Energy limitation
  • 4.2. MARKA: a collective model based on the construction of a potential numeric field
    • 4.2.1. Agent behavior
    • 4.2.2. Algorithm
    • 4.2.3. Self-sufficiency estimation
  • 4.3. TANKER: a self-organized collective approach to optimization
    • 4.3.1. Attractive and repulsive forces
    • 4.3.2. Model behavior (algorithm)
  • 5.1. Simple task
  • 5.2. Dynamic task
  • 5.3. Advantages and drawbacks of the models

Content

Evaluation criteria

Effectively patrolling a potentially dynamic environment requires minimizing the time between visits to the same location. Patrol strategies typically assume a known, two-dimensional environment that can be represented as a graph G(V, E), where V is the set of patrol nodes. We focus on methods that compute node idleness, which can be assessed either at the node level or across the entire graph. The criteria used for this analysis are introduced in reference [8].

• Instantaneous Node Idleness (INI): the number of time steps during which a node has remained unvisited, called idleness in the rest of this report. Computed for each node.

• Instantaneous Graph Idleness (IGI): the mean of the instantaneous idleness of all nodes at a given instant. Computed at the graph level.

• Average Graph Idleness (AvgI): the mean of the IGI over n time steps. Computed at the graph level.

• Instantaneous Worst Idleness (IWI): the highest idleness observed at a given instant, called maximum idleness or worst idleness in this report. Computed at the graph level.
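As a sketch, the four criteria can be computed from the time step of each node's last visit. This is a minimal illustration, not code from the report; the dictionary-based representation is our assumption:

```python
# Sketch (not from the report): the four idleness criteria, computed from
# a mapping {node: time step of its last visit}.

def instantaneous_node_idleness(last_visit, node, t):
    """INI: time steps since `node` was last visited, at time t."""
    return t - last_visit[node]

def instantaneous_graph_idleness(last_visit, t):
    """IGI: mean INI over all nodes at time t."""
    return sum(t - lv for lv in last_visit.values()) / len(last_visit)

def instantaneous_worst_idleness(last_visit, t):
    """IWI: maximum INI over all nodes at time t."""
    return max(t - lv for lv in last_visit.values())

def average_graph_idleness(igi_history):
    """AvgI: mean IGI over the n recorded time steps."""
    return sum(igi_history) / len(igi_history)

# Tiny example: three nodes, current time step t = 10.
last_visit = {"a": 10, "b": 7, "c": 4}
assert instantaneous_node_idleness(last_visit, "b", 10) == 3
assert instantaneous_graph_idleness(last_visit, 10) == 3.0
assert instantaneous_worst_idleness(last_visit, 10) == 6
```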

Environment

Previous work on multi-agent patrolling uses two types of environment: "discrete" space and "continuous" space.

Discrete space is represented as a graph G(V, E), where V denotes the set of nodes to visit and E the edges defining the valid paths between these nodes. This graph representation is well suited to patrolling between points of interest.

Continuous space refers to an area that must be covered, such as a room or a building. This type of space can be modeled as a grid in which each cell represents either a location to visit or an inaccessible area, such as a wall or obstacle.

Figure 1: "Discrete" space and "continuous" space

Understanding the environment is crucial for effective patrolling, as it significantly impacts both the selection of the patrolling algorithm and its overall performance.

When agents possess prior knowledge of their environment, a cognitive architecture is suitable: they can operate offline, memorizing maps or planning optimal routes before executing the task.

When the patrol task is carried out without prior knowledge of the environment, agents must perform a dual role: exploration and patrolling. In this context, reactive agents can be employed, capable of learning or of marking the environment.

In this internship, we focus on patrolling in an unknown environment, where the graph representing the surroundings is not available. The space explored by the agents is represented as a matrix of cells, each of which can:

• be inaccessible (an obstacle, a wall, …)

Previous work

In recent years the patrol problem has been addressed through centralized, heuristic, and distributed approaches, all framed within a graph representation of the environment, where nodes represent predetermined locations to visit and edges the paths connecting them. Various studies rely on graph-traversal algorithms, often inspired by the traveling salesman problem. One solution, for instance, uses ant colony optimization (ACO) algorithms, which also require prior knowledge of the environment in graph form. Similarly, learning-based techniques seek an optimal multi-agent path computed offline, meaning the optimal route is established before the task is executed in the given environment. Consequently, such methods lack the flexibility to adapt to online changes, such as alterations in the environment's topology or the addition or loss of agents.

One limitation of these solutions is the combinatorial explosion that occurs when the graph grows significantly, for example to hundreds of nodes or a growing number of deployed agents. Yet many real-world applications involve patrolling large areas, known or unknown, with a substantial number of agents, such as drones monitoring strategic locations or mobile robots surveilling buildings.

2. Reactive multi-agent systems approach

Collective intelligence (Swarm Intelligence)

Swarm Intelligence (SI), as defined by Bonabeau, refers to a system's ability to generate coherent global patterns through the collective behaviors of simple agents interacting locally with their environment. This concept, inspired by animal societies such as ant colonies and fish schools, has led to a new paradigm in computation and behavior. The field draws on the study of social insects, such as ants and termites, and emphasizes self-organization and the emergence of behaviors, in contrast with approaches modeled on individual biological systems, such as genetic algorithms.

Insect societies exhibit remarkable biological phenomena, such as ants forming bridges with their bodies to cross large gaps and bees constructing parallel honeycomb structures. These colonies face various daily challenges, including food foraging and task allocation among individuals. Ethological studies have revealed that certain collective behaviors in social insects are self-organized: complex structures emerge from simple interactions, such as an ant following a pheromone trail left by another. These interactions enable the colony to collectively solve intricate problems, such as finding the shortest path.

Today, applying collective behavior models from social insects to computational models has provided solutions to complex problems, notably in route optimization, scheduling, and the traveling salesman problem. Such a system is typically:

• Flexible: it adapts to sudden changes in the environment.

• Robust: it dynamically accommodates the addition or removal of agents, as well as failures in task execution, and is able to self-organize to adapt to these changes.

• Decentralized: there is no central controller in the system.

Digital pheromone

Stigmergy, a concept introduced by the biologist Pierre-Paul Grassé in 1959, refers to the indirect communication observed in social insects such as termites and ants, where individuals interact by modifying their environment. This form of communication is central to self-organizing emergent systems, allowing coordination among agents. Digital pheromones serve as artificial stigmergy in multi-agent systems modeled on insect societies. For example, research has demonstrated coordination models for unmanned vehicles using digital pheromone mechanisms for surveillance and pursuit tasks. This indirect communication is particularly effective for tasks in initially unknown environments, such as foraging and coverage exploration.

EVAP: a model based on pheromone evaporation

The proposed EVAP model for patrolling unknown environments is based on the deposition of pheromones, uniquely exploiting the evaporation process. The model marks each visited cell with a maximum pheromone quantity q0, and the remaining amount serves as an indicator of the time elapsed since the last visit, i.e. of its idleness. An agent's behavior is thus defined as a gradient descent on the pheromone: the agent moves toward the cells with the least pheromone.

The evaporation process follows a geometric sequence: the quantity of pheromone in a cell evolves as q(n+1) = q(n) × (1 − coefEvap), with coefEvap between 0 and 1 and q0 > 0. The sequence q(n) is monotonically decreasing for any coefEvap in this range, so the property is independent of the choice of coefEvap. This evaporation process therefore generates a gradient oriented by the chronological order of cell visits.

The gradient-descent behavior enables agents to explore both previously visited and unvisited areas. Each agent's perception is confined to the four cells neighboring its current position, referred to as NeighborCells in the algorithms, in which it can read the pheromone levels. The agent then moves toward the cell with the lowest pheromone value among the four.

When several neighboring cells share the minimum amount of pheromone, the agent selects one of them at random. To prevent erratic movements, which are problematic in robotic applications, the agent keeps its current direction with probability p when this situation occurs.

ALGORITHM Agent EVAP
  m ← min(QPhero(NeighborCells))
  neighborList ← cells of NeighborCells whose pheromone equals m
  nextCell ← the cell we would reach by keeping the current heading
  If nextCell belongs to neighborList and random(1) < p Then goTo(nextCell)
  Else goTo(random cell of neighborList)

ALGORITHM Environment EVAP
  For each cell c of the environment Do QPhero(c) ← QPhero(c) × (1 − coefEvap)
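As an illustration of the behavior just described, here is a minimal Python sketch of one EVAP step. The grid layout, dictionary representation, and the default values of p and q0 are our assumptions, not the report's:

```python
# Sketch of one EVAP step: environment-side evaporation and agent-side
# gradient descent with direction persistence (illustrative parameters).
import random

def evaporate(phero, coef_evap):
    """Environment step: q(n+1) = q(n) * (1 - coefEvap) on every cell."""
    for cell in phero:
        phero[cell] *= (1.0 - coef_evap)

def evap_agent_step(phero, pos, heading, free_cells, p=0.9, q0=1.0):
    """Agent step: move toward the neighboring cell with least pheromone."""
    x, y = pos
    neighbors = [c for c in [(x+1, y), (x-1, y), (x, y+1), (x, y-1)]
                 if c in free_cells]
    m = min(phero[c] for c in neighbors)
    candidates = [c for c in neighbors if phero[c] == m]
    # Keep the current heading with probability p when it is among the minima.
    ahead = (x + heading[0], y + heading[1])
    if ahead in candidates and random.random() < p:
        nxt = ahead
    else:
        nxt = random.choice(candidates)
    phero[nxt] = q0                      # mark the visited cell with q0
    return nxt, (nxt[0] - x, nxt[1] - y)
```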

CLInG: a model based on information propagation

Sempé [16] proposed a multi-agent patrolling algorithm which assumes that the agents are reactive (as in EVAP) and that the environment computes two pieces of information:

• the idleness of each cell;

• the propagation of the maximum idlenesses.

At each time step, the environment computes the idleness of each accessible cell by incrementing its value by one unit. A cell's idleness is reset to zero when an agent visits it.

The CLInG algorithm (Local Choice based on Global Information) is distinctive in that it adds a second piece of information: the propagation of the maximum idlenesses. This cell-to-cell propagation generates an additional gradient that directs agents toward the cells of interest, namely those that have gone unvisited the longest.

More formally, a cell i carries a propagated idleness OP_i in addition to its individual idleness O_i. The gradient formed by the propagated idleness is shared by the whole collective (cf. eq. 2).

The propagated idleness of a cell depends both on the idleness of its neighboring cells and on its own individual idleness. This relationship is expressed as a function that accounts for both the idleness level and the presence of agents along the path.

OP_i = max(O_i, max_j f(OP_j, I(j)))   (eq. 2)

with j the neighboring cells of i, and f the propagation function:

f(OP_j, I(j)) = OP_j − α − β·I(j)   if OP_j − α − β·I(j) ≥ OP_min
f(OP_j, I(j)) = OP_min              otherwise

α is the propagation coefficient; it can take a significant value (for instance 30 in the experiments) so as to create a short-range gradient and prevent all the agents from converging on a single maximum.

The function I intercepts and halts the propagation when it meets an agent: I(j) equals 1 if an agent is present in cell j, and 0 otherwise. This factor also limits the clustering of agents arriving along the same path; the order of magnitude of β is about 10, as detailed in [16].

OP_min is a threshold ensuring that the propagated idleness remains positive and always generates a gradient. Each agent's behavior is to climb the maximum-idleness gradient, as illustrated in Figure 2. This is the dual of the previous EVAP algorithm: the information read in neighboring cells may originate from more distant cells.

Propagation exploits the environment itself to convert objective data into subjective information directly usable by the agents. The algorithm thereby distributes the agents according to the distribution of idleness in the environment.

ALGORITHM Agent CLInG
  m ← max(PropagatedIdleness(NeighborCells))
  neighborList ← cells of NeighborCells whose propagated idleness equals m
  goTo(random cell of neighborList)

ALGORITHM Environment CLInG
  For each cell c of the environment Do increment its idleness O(c) (reset to 0 on a visit)
  For each cell c of the environment Do compute its propagated idleness OP(c) (eq. 2)
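To make the update concrete, here is a small Python sketch of one CLInG environment step implementing eq. 2 as read above. The data structures and the default α, β, OP_min values are illustrative:

```python
# Sketch of one CLInG environment step: age the idleness of every cell,
# then propagate the maximum idleness (eq. 2 as read above).

def cling_update(O, OP, occupied, neighbors, alpha=30, beta=10, op_min=1):
    """O: idleness per cell, OP: propagated idleness from the previous step,
    occupied: cells holding an agent, neighbors: adjacency mapping."""
    for c in O:                                    # idleness grows by 1...
        O[c] = 0 if c in occupied else O[c] + 1    # ...and resets on a visit
    new_OP = {}
    for c in O:
        best = O[c]                                # own idleness is a floor
        for j in neighbors[c]:
            decayed = OP[j] - alpha - beta * (1 if j in occupied else 0)
            best = max(best, decayed if decayed >= op_min else op_min)
        new_OP[c] = best
    return new_OP
```

An agent then simply moves toward the neighboring cell with the largest propagated idleness.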

3. Performance comparison between EVAP and CLInG

Simulation and analysis

We examine the two models across five reference environments of increasing complexity, derived or adapted from [1] and [18], as illustrated in Figure 3. In each of these environments, the black cells represent obstacles, while the remaining cells are to be patrolled.

Topology A is an obstacle-free environment allowing unrestricted movement of the agents. Environment B features a spiral design that creates a dead-end corridor topology. Environment C introduces constraints through a randomly generated density of obstacle cells, with 20% of the area occupied by obstacles and no isolated free cells. Environment D consists of a corridor leading to eight rooms, while environment E presents six interconnected rooms with nested entrances; more generally, the n-piece problem is defined as n rooms with interlinked access points.

We experimented with the algorithms across various populations, systematically doubling the number of agents: 1, 2, 4, 8, 16, 32, and 64. The goal was to evaluate the performance and collective behavior of the models. Each simulation ran for 3000 iterations (4000 for environments D and E) and was repeated 10 times to establish averages.

The optimal theoretical values of idleness are computed from the number of accessible cells in the environment, denoted c. An agent moves to a new cell at each iteration, so it can visit all cells in c − 1 iterations; consequently, the starting cell reaches an idleness of c − 1.

Indeed, the idleness values are distributed as a linear series whose mean is the maximum value / 2.
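The reasoning above can be checked with a two-line helper. This is an illustration of the idealized case only: a single agent that reaches a fresh cell at every iteration:

```python
# Sketch of the theoretical optimum described above, for a single agent
# cycling through the c accessible cells one cell per iteration.

def optimal_idleness(c):
    """c accessible cells -> (optimal max idleness, optimal average idleness)."""
    max_idleness = c - 1                 # the starting cell waits c - 1 steps
    avg_idleness = max_idleness / 2      # idleness values form a linear series
    return max_idleness, avg_idleness

assert optimal_idleness(400) == (399, 199.5)   # e.g. a 20x20 obstacle-free map
```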

Figure 4: Obstacle-free topology, 8 agents (average and maximum idleness of EVAP and CLInG, with the optimal average idleness).

Figure 5: Obstacle-free topology, average IGI.

We briefly present results that generalize to these three environments.

Overall, the performance of the two algorithms is similar across the three environments. Figure 4 shows the results obtained with 8 agents in a 20×20-cell environment over the first 1000 iterations, with agents initially positioned at random. The graph displays both the average and the maximum idleness for the two studied methods, CLInG and EVAP. Notably, the average idleness of both methods stabilizes very close to the theoretical optimum. The maximum idleness, however, does not stabilize, making it difficult to say whether one method outperforms the other.

Figure 5 shows the average idleness of the two methods as the number of agents varies. In both cases, doubling the number of agents significantly improves performance. For each environment, the observed values closely approach the theoretical optimum.

We now examine the behavior of the two models in complex environments composed of multiple rooms to visit, starting with topology D. Figure 6 shows the average and maximum idleness on map D for a single-agent patrol. EVAP converges to an extremely stable and nearly optimal performance, while CLInG performs slightly worse. As the number of agents increases, however, the performance of the two methods becomes equivalent and remains close to the theoretical optimal values.

Figure 6: Corridor-rooms topology, 1 agent, 4000 iterations (average and maximum idleness of EVAP and CLInG, with the optimal values).

Figure 7 clearly shows two distinct phases for EVAP. Up to iteration 830, both the average and the maximum idleness are significantly higher than those of CLInG. This is because the agents initially go through an exploration phase, in which they access the rooms for the first time.

The second phase, which involves revisiting the rooms, is more effectively managed as the pheromone directly guides the agents to the most remote areas.

Figure 7: 6-rooms topology, 4 agents, 2000 iterations.

Figure 8: 6-rooms topology, average IGI.

The challenge for EVAP arises at the doors separating two rooms. An agent may continue its exploration without revisiting a door, thereby overlooking the unvisited room behind it. This issue, identified by Wagner, stems from a purely local perspective when choosing between two equally significant nodes.

Figure 9: EVAP and CLInG, map E

CLInG avoids the problem of unvisited areas by generating a strong attraction toward unexplored rooms, so that an agent approaching a door is drawn into them. The propagation process gives the agent a broader view of its environment, allowing rooms to be reached optimally from the initial exploration onward.

Discussion

Complexity

We have identified topologies in which CLInG's information propagation pays off. This process, however, has a cost. The difference in algorithmic complexity does not come from the agents' behavior, which is comparable, but from the computations performed by the environment at each time step: for c cells containing pheromone, c evaporation operations must be performed.

In an evaporation environment with n cells per side, the maximum number of evaporation operations required is n². The CLInG environment is significantly more demanding, requiring n² idleness computations and n² propagation operations, i.e. 2n² operations in total. CLInG is therefore twice as costly in execution time as plain evaporation.

Exploration and patrolling

Simulations in complex environments reveal two operating regimes, particularly for EVAP. The system first goes through an exploration phase, followed by a sudden transition to a more stable and efficient behavior. A similar initial phase is observed for CLInG, but its duration is generally shorter thanks to the attraction exerted by unexplored areas.

Advantages and drawbacks of the methods

Overall, the performance of both algorithms is close to the theoretical optimum. Our findings indicate that in moderately complex environments, pheromone deposition is sufficient to ensure a low average idleness. When focusing on the maximum instantaneous idleness (worst idleness), however, CLInG generally performs better, thanks to the attraction exerted by the cells with the highest idleness.

One of the surprising findings of this study is that strong, even optimal, performance is achieved with single-agent patrols. This indicates that environment-marking processes can effectively address single-agent problems while ensuring scalability.

Our study shows that CLInG outperforms EVAP in complex environments with interlocking rooms, especially when the number of agents is low. A more detailed examination and discussion of these two algorithms can be found in [5], which we prepared during this internship (see Annexes).

4. The energy problem in patrolling

Energy limitation

One of the key applications of multi-agent systems is multi-robot systems. Robots, like any machine, consume energy and require a renewable energy source that must be replenished regularly, either automatically or partially manually. In autonomous robotics, energy maintenance is a critical requirement for building robust systems.

An autonomous robot, however sophisticated its artificial intelligence, is limited in lifespan and workload by the available energy. This fundamental issue is common to all living beings and imposes interesting constraints on the design of intelligent autonomous systems. Several challenges remain to be addressed, however.

Energy autonomy for mobile robots rests on two key capabilities: self-sufficiency and self-recharging. Concretely, an autonomous robot must be able to:

• find charging stations;

• recognize its own need to recharge.

Other capabilities, such as interacting with and connecting to a charging station, or sharing a station with other robots, belong more to pure robotics than to artificial intelligence but should also be taken into account.

The energy issue has been explored extensively in recent years, primarily from a physical perspective. To date, no research paper has focused solely on the problem of patrolling under energy constraints.

The conventional approach places a charging device at a fixed location frequently visited by the agents. Various studies based on this method focus primarily on the physical design and installation of the charging system and on the interaction between robots and the station. They generally overlook, however, the problem of locating the charging stations.

Station sharing has rarely been addressed. Munoz-Melendez et al. designed a group of self-sufficient mobile robots able to share a charging station efficiently using simple mechanisms that require no communication. Sempé proposed an information-propagation mechanism that helps robots locate and share stations. Estimating self-sufficiency on real robots is more common; Gérard, for instance, presented an estimation method based on neural networks.

An alternative approach to the energy problem was, however, proposed in [21] by Zebrowski and Vaughan: a "tanker" robot that carries energy and distributes it to several worker robots. The tanker's sole task is to seek out and recharge the worker robots.

The study of the energy problem is essential to multi-robot patrolling. In this setting, an agent/robot must be able to:

• discover charging stations (particularly in an unknown environment);

• estimate the remaining activity time before its energy is exhausted;

• decide to go and recharge before depletion;

• find a path back to the charging stations.

In the following sections, we present our two models, based on the two approaches mentioned above.

MARKA: a collective model based on the construction of a potential numeric field

Agent behavior

Agents occupy a single cell and perceive only the four adjacent cells. The idea is that at each time step of the patrol, in addition to the digital pheromone, agents write into each visited cell a value computed from the markings of the neighboring cells, following a classical equation.

val(c) = 1 + min{ val(v) : v ∈ Neighborhood(c) }   (eq. 3)

Figure 10: The environment-marking process

The value at the charging station is zero, so these numeric fields encode the shortest distance from the charging station to each marked cell. This information directs agents to the nearest charging station.

Figure 11 illustrates the formation of a numeric field around a charging station. When agents need recharging, they perform a gradient descent, moving toward the cells with the smallest marks.

Figure 11: Formation of the numeric-field gradient

Algorithm

The MARKA agent algorithm divides into two main phases: a "Patrol Phase" and an "Energy Search Phase".

ALGORITHM Agent MARKA — Patrol Phase
  Patrol; m ← min(Mark(Neighborhood)); mc ← Mark(current cell)
  Mark the current cell with min(mc, m + 1)   (eq. 3)

Once the agent needs energy, it switches to the Energy Search Phase: a descent of the mark gradient.

ALGORITHM Agent MARKA — Energy Search Phase
  If a charging station is in Neighborhood Then go and recharge
  Else nextCell ← random cell of Neighborhood with the smallest mark; goTo(nextCell)
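A minimal Python sketch of this two-phase logic, under our reading of eq. 3 above; the dictionary-based grid and helper names are our assumptions:

```python
# Sketch of MARKA's marking rule and the Energy Search Phase descent.
from math import inf

def mark_cell(marks, cell, neighborhood):
    """Marking rule (eq. 3 as read above): a visited cell takes
    1 + the smallest mark in its neighborhood; stations keep value 0."""
    if marks.get(cell) == 0:                 # charging station: field source
        return 0
    vals = [marks[n] for n in neighborhood(cell) if n in marks]
    if vals:
        marks[cell] = min(vals) + 1
    return marks.get(cell, inf)

def energy_search_step(marks, pos, neighborhood):
    """Energy Search Phase: descend the mark gradient toward the station."""
    return min(neighborhood(pos), key=lambda c: marks.get(c, inf))
```

On a 1-D corridor with the station at cell 0, marking cells 1 then 2 yields marks 1 and 2, and the descent from cell 1 returns to the station.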

Self-sufficiency estimation

A notable feature of MARKA is that an agent can estimate the energy required to reach a charging station before depletion. Thanks to the environment markings, the agent knows its relative distance to the nearest charging station and can therefore estimate whether its remaining energy suffices to return to it.

Energy decreases over time at a rate e_d per step; e_c denotes the remaining energy, m the mark of the current cell, and sc a preventive margin in time steps. At each time step, the agent tests whether its remaining energy still covers the m steps back to the station plus the margin, i.e. whether e_c ≤ e_d × (m + sc). When the condition holds, the agent switches to the Energy Search Phase.
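Under this reading of the condition (our reconstruction, not the report's exact formula), the test is a one-liner:

```python
# Self-sufficiency test as read above: go and recharge once the remaining
# energy e_c no longer covers the m steps back to the station plus a
# preventive margin of sc steps, at a consumption of e_d per step.

def must_recharge(e_c, e_d, m, sc):
    return e_c <= e_d * (m + sc)
```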

TANKER: a self-organized collective approach to optimization

Attractive and repulsive forces

From an agent's perspective, the attractive force is the force that pulls the tanker toward it. It is represented by a vector whose intensity is proportional to the agent's demand weight and to the distance between the agent and the tanker. Specifically, a tanker A perceiving a demand D of weight W_D is subject to the corresponding attractive force.

The influence of the attraction decreases as the tanker approaches the agent. Here W_D = 1, meaning that the roles and demands of all agents are equivalent. For a group of agents, the influence on the tanker is defined as the sum of the forces exerted by each agent, where n is the number of agents perceived by tanker A within the attraction radius r_a (Figure 12).

Consequently, the tanker moves to the barycenter of the demands, the point of minimal average distance to them.

Figure 12: Attraction guides the tanker to the barycenter of the demands

When several tankers apply such a behavior, some of them may end up at the same position. To avoid this, a repulsive force is used.

The intensity of the repulsive force is inversely proportional to the distance between tankers. Figure 13 illustrates the repulsion process between two tankers.

Figure 13: Repulsion maintains the distance between tankers A and B

In this context, we use the following formula to express the repulsive force exerted by a tanker B on a tanker A:

The total repulsive force undergone by tanker A is then computed as follows (m being the number of tankers perceived by tanker A within the repulsion radius r_r):

A tanker's final behavior is defined as the sum of the attractive and repulsive forces.
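The report's force formulas are not reproduced in this copy, so the following Python sketch only illustrates the stated properties: attraction proportional to the demand weight and the distance, repulsion inversely proportional to the distance between tankers.

```python
# Illustrative force computation for a tanker (not the report's formulas).

def attraction(tanker, agents, w_d=1.0):
    """Sum of vectors pulling the tanker toward each perceived agent;
    each term's magnitude is w_d times the distance to the agent."""
    fx = sum(w_d * (ax - tanker[0]) for ax, _ in agents)
    fy = sum(w_d * (ay - tanker[1]) for _, ay in agents)
    return fx, fy

def repulsion(tanker, others):
    """Sum of vectors pushing the tanker away from each perceived tanker,
    with magnitude 1/d (inversely proportional to the distance d)."""
    fx = fy = 0.0
    for bx, by in others:
        dx, dy = tanker[0] - bx, tanker[1] - by
        d2 = dx * dx + dy * dy
        if d2 > 0:
            fx += dx / d2            # (dx/d)/d: unit vector scaled by 1/d
            fy += dy / d2
    return fx, fy
```

With attraction alone and W_D = 1, the force vanishes exactly at the barycenter of the agents' positions, matching the observation above.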

Model behavior (algorithm)

ALGORITHM Tanker
  If status(Tanker) = compute Then
    Compute the attraction; compute the repulsion; compute the vector Move
    status(Tanker) ← move
  Else
    Move to the next cell in the direction of Move; deposit pheromone Q_0

The tankers' behavior consists of two tasks: computing the forces to optimize their position, and depositing pheromones to attract the agents.

In our context, a tanker's working speed is slower (specifically, three times slower than that of a worker agent in our experiments). This choice yields stable behavior, since the tankers in our context are not stationary but move quickly.

The environment diffuses the pheromones released by the tankers. This diffusion spreads a portion of the deposited pheromone out from the source cell, creating circular equipotential lines centered on the deposit site.

Figure 14: Diffusion in a discrete environment

Let q_i be the quantity of pheromone on cell i, n the number of neighboring cells, and coefDiff the diffusion coefficient. The evolution law of q_i is:

Pheromone diffusion advances one cell per iteration: after k iterations, the process has progressed by k cells.
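The evolution law itself is missing from this copy; as an illustration only, a standard discrete diffusion rule consistent with the description (each cell keeps a fraction 1 − coefDiff of its pheromone and shares the rest evenly among its n neighbors, so the front advances one cell per iteration) looks like:

```python
# Illustrative discrete diffusion step (the report's exact law is missing
# here): each cell keeps (1 - coefDiff) of its pheromone and spreads the
# remainder evenly over its neighboring cells.

def diffuse(phero, neighbors, coef_diff=0.4):
    new = {c: q * (1.0 - coef_diff) for c, q in phero.items()}
    for c, q in phero.items():
        share = coef_diff * q / len(neighbors[c])
        for j in neighbors[c]:
            new[j] = new.get(j, 0.0) + share
    return new
```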

An agent's behavior is organized into two main phases, the "Patrol Phase" and the "Energy Search Phase". Here, the Energy Search Phase consists of climbing the gradient of the pheromones released into the environment.

ALGORITHM Agent — Energy Search Phase (TANKER)
  If a Tanker is in Neighborhood Then
    Move to the Tanker to recharge
  Else
    m ← max(QPhero(Neighborhood))
    nextCell ← random cell of Neighborhood whose pheromone equals m
    goTo(nextCell)

5. Performance of MARKA and TANKER

Simulations were run with the MadKit simulator (www.madkit.org). The agents have a finite amount of energy that decreases with each action. We analyze the two models in a 20×20-cell environment without obstacles, in two different contexts:

- Simple task: one group of agents works in the environment.

- Dynamic task: several groups of agents work in different regions of the environment.

We use the following criteria to evaluate the performance of the two models:

• Recharge Time: the number of times an agent went to recharge.

• Total Step: the total time spent going to recharge.

• Average Step: the average time to go and recharge.

Simple task

Figure 15: MARKA and TANKER, 4 agents, 4000 iterations

Figure 15 shows the performance obtained with four agents and one charging station/one tanker over 4000 iterations, with agents initially positioned at random. The graph reports the total time step, the recharge time, and the average time step for both methods. We place the charging station (for MARKA) and the tanker's initial position at the center of the environment, which is the optimal point.

Regarding recharge time, the MARKA algorithm has a slight advantage, since its agents can assess whether their remaining energy suffices to return to the charging station. On average, however, MARKA and TANKER are similarly efficient. Notably, the TANKER algorithm keeps its position at the centroid of the working agents, which is the optimal location.

T ÂCHE DYNAMIQUE

We are now examining the behavior of the two models in a different work context, where the environment consists of multiple regions to explore, and several groups of agents operate in various areas Each group is assigned to a specific region, and at certain points during the simulation, the agents transition between these regions.

We evaluated the performance of two methods using one and two groups of agents, each consisting of four agents that were initially positioned randomly The number of charging stations/tankers matched the number of groups, with their initial placement illustrated in Figure 16.

Figure 17: MARKA and TANKER, 2 groups, 4000 iterations

Figure 17 shows the performance of the two groups of agents over 4000 iterations. TANKER outperforms MARKA on the Total Step criterion, and its advantage is even larger on Average Step. This better performance is due to the tanker agents' ability to track the working agents.

The tanker initially moves towards the barycenter of the agents, which represents the equilibrium position of their attractive forces As it transitions from one region to another, the attractive forces exerted by the agents draw the tanker closer Consequently, the tanker aligns itself with the barycenter of the positions again Additionally, it is noted that there is no redundancy in the positions of the tankers, as they distribute themselves within groups due to the repulsive forces they exert on one another.
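This barycenter-seeking behaviour can be sketched with a simple force model; the linear attraction and the gain value are illustrative assumptions on our part, not the thesis implementation:

```python
def tanker_step(tanker, agents, gain=0.1):
    """Move a tanker one step under the attractive forces of the working
    agents; tanker is (x, y), agents a list of (x, y)."""
    fx = sum(ax - tanker[0] for ax, ay in agents)
    fy = sum(ay - tanker[1] for ax, ay in agents)
    n = len(agents)
    return (tanker[0] + gain * fx / n, tanker[1] + gain * fy / n)

# Iterating the update drives the tanker to the agents' barycenter,
# the equilibrium point of the attractive forces:
pos = (0.0, 0.0)
for _ in range(200):
    pos = tanker_step(pos, [(0, 0), (10, 0), (0, 10), (10, 10)])
# pos is now very close to the barycenter (5, 5)
```

A repulsive term between tankers (omitted here) would yield the observed distribution of tankers among the groups.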

ADVANTAGES AND DRAWBACKS OF THE MODELS

MARKA is an effective solution for energy challenges, particularly with its ability to estimate agent self-sufficiency, which is crucial in addressing these issues Additionally, this algorithm can operate in environments with obstacles However, it still faces challenges regarding the discovery of charging stations.
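The self-sufficiency estimate mentioned above can be sketched as a simple energy test. In MARKA the distance to the station would be read from the numerical potential field; here it is passed in directly, and the unit step cost and safety margin are our assumptions:

```python
def is_self_sufficient(energy, dist_to_station, cost_per_step=1, margin=2):
    """Return True if the agent can afford one more patrolling step and
    still reach the charging station, with a small safety margin."""
    return energy >= (dist_to_station + 1 + margin) * cost_per_step
```

An agent with 20 energy units at distance 10 keeps patrolling; at 12 units it heads back to recharge.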

TANKER outperforms MARKA in dynamic work scenarios involving multiple groups across various regions A key feature of TANKER is its ability to track agents, evolve towards optimal positions, and effectively distribute tankers among agent groups However, TANKER's limitation lies in its low adaptability to complex environments with obstacles such as walls.

This internship focused on studying the collective intelligence approach to the multi-agent patrolling problem in an unknown environment Additionally, it aimed to incorporate energy constraints into the patrolling challenge and to propose an algorithm that enables agents to coordinate their patrolling and recharging activities effectively.

We carried out comparative tests between the CLInG model [16] and EVAP. This experimental study showed the interest of a collective-intelligence approach:

• Simple but effective: the agents' behavior is simple, yet the results obtained are remarkably good.

• Convergence towards a stable performance.

• Robustness: the system is able to reorganize itself to adapt to different patrol configurations.

This study is presented in more detail in a paper accepted at the IEEE ICTAI 2007 conference (International Conference on Tools with Artificial Intelligence).

We introduced two models, MARKA and TANKER, to address the energy-limited patrolling problem. TANKER's strong performance shows its effectiveness for the dynamic, multi-group version of the energy-supply problem. These results also validate the approach of Moujahed et al. [12].

One of the main objectives in continuing the work on the TANKER and MARKA models is to conduct real-world experiments using WIFIBots. However, transitioning from simulation to reality remains a significant challenge: the robustness and adaptability of TANKER and MARKA in the face of disturbances must be investigated further to evaluate the robots' behavior in practice.

Both the MARKA and TANKER models demonstrate suitable characteristics for the SCOUT project (Survey of Catastrophes and Observation of Urban Territories) in Vietnam Therefore, it is essential to continue studying the adaptation of these models to fit the project's context.

[1] A. Almeida, G. Ramalho, H. Santana, P. Tedesco, T. Menezes, V. Corruble, Y. Chevaleyre, Recent Advances on Multi-Agent Patrolling, Proceedings of the 17th Brazilian Symposium on Artificial Intelligence, pp. 474-483, 2004.

[2] E. Bonabeau, M. Dorigo, G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, 1999.

[3] Y. Chevaleyre, Le Problème Multiagent de la Patrouille, In Annales du LAMSADE No 4, 2005. http://www.lamsade.dauphine.fr/~chevaley/papers/anna_patro.pdf

[4] Y. Chevaleyre, Theoretical analysis of multi-agent patrolling problem, Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology, pp. 302-308, 2004.

[5] H.N. Chu, A. Glad, O. Simonin, F. Sempé, A. Drogoul, F. Charpillet, Swarm approaches for the patrolling problem, information propagation vs. pheromone evaporation, IEEE International Conference on Tools with Artificial Intelligence (ICTAI), 2007.

[6] O. Gérard, J.-N. Patillon, F. d'Alché-Buc, Discharge Prediction of Rechargeable Batteries with Neural Networks, in Integrated Computer-Aided Engineering.

[7] F. Lauri, F. Charpillet, Ant Colony Optimization applied to the Multi-Agent Patrolling Problem, IEEE Swarm Intelligence Symposium, 2006.

[8] K. Kouzoubov, D. Austin, Autonomous Recharging for Mobile Robotics, Proceedings of the Australian Conference on Robotics and Automation, 2002.

[9] A. Machado, G. Ramalho, J.-D. Zucker, A. Drogoul, Multi-Agent Patrolling: an Empirical Analysis of Alternative Architectures, Proceedings of Multi-Agent Based Simulation, pp. 155-170, 2002.

[10] A. Machado, A. Almeida, G. Ramalho, J.-D. Zucker, A. Drogoul, Multi-Agent Movement Coordination in Patrolling, In 3rd International Conference on Computers and Games, 2002.

[11] A. Muñoz-Meléndez, F. Sempé, A. Drogoul, Sharing a charging station without explicit communication in collective robotics, Proceedings of the 7th International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pp. 383-384, 2002.

[12] S. Moujahed, O. Simonin, A. Koukam, Location Problems Optimization by a Self-Organizing Multiagent Approach, in MAGS International Journal on Multiagent and Grid Systems (IOS Press), Special Issue on Engineering Environments For Multiagent Systems, 2007.

[13] L. Panait, S. Luke, A pheromone-based utility model for collaborative foraging, Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 36-43, 2004.

[14] H.V. Parunak, M. Purcell, R. O'Connell, Digital Pheromones for Autonomous Coordination of Swarming UAV's, Proceedings of the AIAA First Technical Conference and Workshop on Unmanned Aerospace Vehicles, Systems, and Operations, 2002.

[15] H. Santana, G. Ramalho, V. Corruble, B. Ratitch, Multi-Agent Patrolling with Reinforcement Learning, Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multi-Agent Systems, pp. 1122-1129, 2004.

[16] F. Sempé, Auto-organisation d'une collectivité de robots : application à l'activité de patrouille en présence de perturbations, PhD Thesis, Université Paris 6.

[17] F. Sempé, A. Drogoul, Adaptive Patrol for a Group of Robots, Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, Las Vegas, Nevada, pp. 2865-2869, 2003.

[18] M. Silverman, D.M. Nies, B. Jung, G.S. Sukhatme, Staying alive: A docking station for autonomous robot recharging, Proceedings of the IEEE International Conference on Robotics and Automation, Washington D.C., 2002.

[19] O. Simonin, F. Charpillet, E. Thierry, Collective construction of numerical potential fields for the foraging problem.

[20] I.A. Wagner, M. Lindenbaum, A.M. Bruckstein, Distributed Covering by Ant-Robots using Evaporating Traces, IEEE Transactions on Robotics and Automation, 1999.

[21] P. Zebrowski, R.T. Vaughan, Recharging Robot Teams: A Tanker Approach, Proceedings of the International Conference on Advanced Robotics (ICAR), 2005.

Swarm Approaches for the Patrolling Problem, Information Propagation vs Pheromone Evaporation

Hoang-Nam Chu 1,2, Arnaud Glad 1, Olivier Simonin 1, François Sempé 2, A. Drogoul 3, F. Charpillet 1

1 MAIA, INRIA Lorraine, Campus scientifique, BP 239, 54506 Vandœuvre-lès-Nancy, France

2 Institut Francophone pour l'Informatique, Hanoi, Vietnam

3 IRD - Institut de Recherche pour le Développement, Bondy, France {hoangnam.chu, arnaud.glad, olivier.simonin, francois.charpillet}@loria.fr francois@ifi.edu.vn, drogoul@mac.com

This paper addresses the multi-agent patrolling problem in unknown environments through two collective approaches that leverage environmental dynamics We first establish performance criteria and introduce an algorithm based on the evaporation of pheromones emitted by reactive agents (EVAP) Additionally, we present the CLInG model, proposed in 2003, which incorporates the diffusion of area idleness Through systematic simulations, we compare the performance of these two models in increasingly complex environments Our analysis is further enhanced by comparing the results with theoretical optimal performance, enabling us to identify the topologies best suited for each method.

Keywords: Multi-agent patrolling, reactive multi-agent systems, digital pheromones

Patrolling consists of deploying a set of agents (robots) in an environment in order to regularly visit all the accessible places [5].

Recent studies have explored the problem using centralized, heuristic, and distributed approaches, primarily within a discrete environment represented as a graph, where vertices signify predetermined locations and edges denote valid paths Various graph search algorithms, often inspired by the traveling salesman problem, have been proposed, including a solution by Lauri and Charpillet that utilizes Ant Colony Optimization (ACO) algorithms, which also relies on a graph representation Additionally, some methods employ learning techniques to compute an optimal multi-agent path offline, which is executed online in the given environment However, these solutions lack the ability to adapt to real-time changes in the environment, such as fluctuations in the number of agents or the movement of obstacles.

Moreover, these approaches are subject to combinatorial explosion when the graph becomes large (several hundred nodes) or when the number of deployed agents increases.

However, many of today's concrete applications pose the patrolling problem on large spaces, known or unknown, with a significant number of agents (drones deployed to supervise a strategic site, patrolling of buildings by mobile robots, etc.).

To address the challenges posed by unknown environments, swarm intelligence presents a promising solution This approach relies on environmental marking, drawing inspiration from the pheromone trails left by ants, which facilitates indirect communication and coordination among agents.

Digital pheromones operate through two key processes influenced by the environment: diffusion and evaporation The diffusion process facilitates the spread of information based on proximity, while evaporation gradually diminishes the quantity of information over time.
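On a grid, the two processes admit a compact formulation; the rates and the reflecting boundary below are illustrative choices of ours, not values from the models compared here:

```python
def evaporate(grid, rho=0.1):
    """Scale every cell down: a fraction rho of the pheromone vanishes."""
    return [[q * (1 - rho) for q in row] for row in grid]

def diffuse(grid, d=0.2):
    """Spread a fraction d of each cell's pheromone to its 4 neighbours
    (mass is conserved; walls reflect the outgoing share)."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            share = d * grid[i][j] / 4
            out[i][j] += (1 - d) * grid[i][j]
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    out[ni][nj] += share
                else:
                    out[i][j] += share  # reflect at the boundary
    return out
```

Repeated diffusion spreads information by proximity, while repeated evaporation makes it fade over time.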

In 2003, Sempé et al introduced an algorithm called CLInG, which leverages information propagation akin to a diffusion process, highlighting the benefits of an active environment approach However, this method is considered relatively costly due to its reliance on propagation processes and idleness evaluation To address these concerns, we present a new algorithm, the EVAP model, which is based solely on the evaporation of digital pheromones deposited by agents.
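A plausible sketch of such an evaporation-only agent rule, under the assumption that an agent greedily moves to the least-marked neighbouring cell (the one whose last visit is oldest) and re-marks the cell it reaches; tie-breaking is simplified here:

```python
def evap_agent_step(grid, pos, q_max=1.0):
    """Move one agent: pick the 4-neighbour with the lowest pheromone
    level, then reset that cell's level to the maximum."""
    h, w = len(grid), len(grid[0])
    i, j = pos
    neighbours = [(i + di, j + dj)
                  for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                  if 0 <= i + di < h and 0 <= j + dj < w]
    best = min(neighbours, key=lambda c: grid[c[0]][c[1]])  # oldest visit
    grid[best[0]][best[1]] = q_max                          # re-mark it
    return best
```

Combined with a periodic evaporation step over the whole grid, repeated applications of this rule push agents toward the least recently visited cells.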

This article aims to compare two collective techniques that operate within an active environment, highlighting their differing complexities By doing so, we seek to enhance our understanding of the functioning and performance of these stigmergic principles.

This article is structured to first define the patrolling problem and establish performance criteria in Section 2 Section 3 introduces the EVAP model, while Section 4 discusses the CLInG model In Section 5, we conduct experiments comparing both models in increasingly complex environments Finally, Section 6 synthesizes the results to highlight the advantages of each approach.

Finally, this work ends with a conclusion and presents some perspectives.

2 The multi-agent patrolling problem

Patrolling involves the strategic deployment of a set number of agents to periodically monitor key locations within an area The primary objectives include gathering reliable information, searching for specific objects, and ensuring the security of these places against potential intrusions.

To ensure effective patrols in dynamic environments, it is crucial to minimize the delay between consecutive visits to the same location Previous research on multi-agent patrol strategies assumes a known, two-dimensional environment that can be represented as a graph G (V, E), where V denotes the set of nodes to be visited and E represents the arcs that define valid paths between these nodes.
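The quantity to minimize here, the delay since each node's last visit (its idleness), can be tracked with a few lines; the function and variable names are ours:

```python
def idleness(last_visit, t):
    """Per-node idleness at time t, given each node's last visit time,
    together with the average and worst (maximum) idleness."""
    idl = {v: t - lv for v, lv in last_visit.items()}
    return idl, sum(idl.values()) / len(idl), max(idl.values())
```

For instance, with last visits at times 0 and 5 and current time 10, the node idleness values are 10 and 5, the average 7.5 and the worst 10.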
