Since, everything done on behalf of your instance is logged, detecting if you have a large number of bots, or invalid users isn’t that challenging.

These queries can be executed via docker exec -it, via remoting into the container, via pg query tools, or via pgadmin.

For listing all comments performed by users on your instance (This includes comments made remotely):

SELECT
	p.actor_id
	, p.name
	, c.content as comment
FROM public.comment c 
JOIN public.person p on p.id = c.creator_id
WHERE 
	p.local = 'true'
	AND p.admin = 'false' -- Exclude Admins
;

For listing all posts created, by users, from your instance-

SELECT
	p.actor_id
	, c.name AS title
	, c.body as body
FROM public.post c 
JOIN public.person p on p.id = c.creator_id
WHERE 
	p.local = 'true'
	AND p.admin = 'false' -- Exclude Admins
;

Lastly, here is a query to identify users who consistently upvotes or downvotes the same user over and over.

SELECT
	p.id
	, p.name
	, p.actor_id
	, cr.name as creator
	, count(1)
FROM public.comment_like l
JOIN public.comment c on c.id = l.comment_id
JOIN public.person p on p.id = l.person_id
JOIN public.person cr on cr.id = c.creator_id
WHERE 
	p.id != cr.id
	AND p.local = 'true'
	AND p.admin = 'false' -- Exclude Admins
GROUP BY p.id, p.name, p.actor_id, cr.name
ORDER BY count(1) desc
;

If- anyone has idea of other queries which can be created for detecting suspicious activity, please LMK.

Edit- added where clause to exclude admins. If your admins are spambots, you have bigger issues to worry about.

  • HTTP_404_NotFound@lemmyonline.comOP
    link
    fedilink
    English
    arrow-up
    0
    ·
    edit-2
    1 year ago

    Remember- any federated content is also stored on your database- Part of how this platform works.

    If- you don’t have much disk space- you might consider joining a larger instance.

    (Also, you CAN clean up the activity table daily too)

    Edit- I do have a kubernetes CRD which handles automatically purging the activity table, for data older than a couple days.